My PLoS-ONE Academic Editor Activity Summary

I recently received an email from PLoS-ONE summarizing my editorial activity for the first half of 2014. That’s a good thing: for one, I’m terrible at keeping track of all my service activities, and this helps me keep them straight for my own annual activities report for my university. Second, I can see how I fare against other editors, and how well I do generally. Looking at the report below, it seems I processed fewer submissions than average in this period. I am pretty efficient at finding reviewers, and it seems I end up accepting all papers (although there were only 2 papers this time)! Time on my desk is about average, although I am slower in the time taken from when the revision is returned to when I make a decision. That’s probably because I accept a paper to edit only when I know I have some spare time, whereas revisions tend to come back, frustratingly, when I’m trying to beat a grant deadline, during busy course exam periods, or, once, during my (rare) vacation time.

Overall, this is a great service PLoS-ONE provides its editors. Good work, Damian Pattinson and the rest!

 

Editor: Iddo Friedberg, Academic Editor

Time Period: January 1 – June 30, 2014

Join Date: 3/13/2009

| Metric | My Metrics | Board Avg (mean) | Quartile |
|---|---|---|---|
| Number of final decisions | 2 | 5.78 | 4 |
| Monthly avg | 0.33 | 0.98 | - |
| Number of first decisions | 1 | 5.79 | 4 |
| Number currently out for revision | 1 | 1.34 | - |
| Total decisions | 4 | 11.26 | 3 |

| Metric | My Metrics | Board Avg (mean) | Quartile |
|---|---|---|---|
| Acceptance rate | 100% | 66% | - |
| Efficiency of reviewer selection | 71% | 44% | 1 |

| Metric | My Metrics | My Median | Board Avg (mean) | Board Avg (median) | Quartile |
|---|---|---|---|---|---|
| Avg total editorial time | 100 | 100 | 97.69 | 80 | 3 |
| Avg time on my desk | 39.50 | 39.50 | 23.4 | 16 | 4 |
| Percent my time to total editorial time | 39.50% | - | 25% | 21.63% | 4 |
| Avg time to 1st decision | 29 | 29 | 32.93 | 28 | 2 |
| Avg time from first revision returned to final decision | 32 | 32 | 34 | 21 | 3 |

Definitions and abbreviations:

Quartiles: Quartiles (Q1-Q4) were calculated using standard descriptive statistics on the data set for each metric. Here, Q1 = 75-100%; Q2 = 50-75%; Q3 = 25-50%; and Q4 = 0-25%.

Number of final decisions: Total number of Academic Editor final decisions (accept or reject) within the defined time period.
Average per month: “Number of final decisions” (above), divided by the number of months covered.
Minimum: 12 final decisions per year or 6 decisions over the course of six months. Standard: 2-3 final decisions per month.
Number of first decisions: Total number of Academic Editor first decisions within the defined time period, excluding first decisions to accept or reject.

Number currently out for revision: Number of submissions with Academic Editor revision decisions awaiting author resubmission at the time when the data is retrieved.
Total decisions rendered: Total number of Academic Editor final decisions of any kind (accept, reject, major revision, minor revision) within the defined time period.

Acceptance rate: The total number of accepts divided by the total number of final decisions. PLOS ONE’s historical average has been around 70%. Very high or low acceptance rates may indicate decision-making standards not in keeping with the PLOS ONE publication criteria.
Efficiency of reviewer selection: The total number of reviews completed divided by the number of invitations. Standard: 5 invitations issued; 2 completed reviews returned.

Total editorial time: Average (mean and median) total editorial time in calendar days that each manuscript spends in the editorial process from Academic Editor assignment to final decision.
Time on my desk: “Total editorial time” (above), excluding time with the authors, journal office and reviewers, when time with reviewers is defined as the period between the first reviewer accepting the invitation and the final review being returned.
Percent of my time to total editorial time: The time on the editor’s desk divided by the total editorial time, shown as a percentage.

Time to 1st decision: Average (mean and median) total editorial time in calendar days that each manuscript spends in the editorial process from Academic Editor assignment to first decision.
Avg time from 1st revision returned to final decision: Average (mean and median) time in calendar days from a manuscript’s being returned to the Academic Editor following an initial revision, to the final decision being reached, including any subsequent review and author revision periods that may occur.
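The ratio metrics defined above are simple enough to check by hand. Here is a minimal Python sketch of that arithmetic, using only numbers that appear in the report and in the definitions. The function names are my own, for illustration; they are not anything PLOS ONE publishes.

```python
def acceptance_rate(accepts, final_decisions):
    """Accepts divided by final decisions, as a fraction."""
    return accepts / final_decisions


def monthly_average(final_decisions, months):
    """Final decisions divided by the number of months covered."""
    return final_decisions / months


def reviewer_efficiency(completed_reviews, invitations):
    """Completed reviews divided by invitations issued, as a fraction."""
    return completed_reviews / invitations


def percent_desk_time(desk_days, total_days):
    """Time on the editor's desk as a percentage of total editorial time."""
    return 100 * desk_days / total_days


# Values taken from the report: 2 final decisions, both accepts, over 6 months.
my_acceptance = acceptance_rate(2, 2)
my_monthly = round(monthly_average(2, 6), 2)
# Average of 39.5 days on my desk out of 100 days total editorial time.
my_desk_share = percent_desk_time(39.5, 100)
# The stated "standard": 5 invitations issued, 2 completed reviews returned.
standard_efficiency = reviewer_efficiency(2, 5)
```

So the "standard" reviewer efficiency works out to 40%, a bit below the board average of 44% shown in the report.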

PLOS ONE Editorial Board

Mozilla does scientific matchmaking between programmers and researchers

Mozilla Science Labs are looking to pair programmers and scientists. If you are a scientist in need of a programmer, read the following, and then go to the website to see how to take it further. Thanks to Miami University’s Office for Advancement of Research and Scholarship for bringing this to my attention.

 

Interdisciplinary Programming is looking for research projects to participate in a pilot study on bringing together the scientific and developer communities to work together on common problems to help further science on the web.  This pilot will be run with the Mozilla Science Lab as a means of testing out new ways for the open science and open source community to get their hands dirty and contribute. The pilot is open to coders both within the research enterprise as well as those outside, and for all skill levels.

In this study, we’ll work to break accepted projects down to digestible tasks (think bug reports or github issues) for others to contribute to or offer guidance on. Projects can be small to mid-scale – the key here is to show how we can involve the global research and development community in furthering science on the web, while testing what the right level of engagement is.  Any research-oriented software development project is eligible, with special consideration given to projects that further open, collaborative, reproducible research, and reusable tools and technology for open science.

Candidate research projects should:

  • Have a clearly stated and specific goal to achieve or problem to solve in software.
  • Be directly relevant to your ongoing or shortly upcoming research.
  • Require code that is sharable and reusable, with preference given to open source projects.
  • Come with a science team that is prepared to communicate regularly with the software team.

Interdisciplinary Programming was the brainchild of Angelina Fabbro (Mozilla) and myself (Bill Mills, TRIUMF) that came about when we realized the rich opportunities for cross-pollination between the fields of software development and basic research.   When I was a doctoral student writing analysis software for the Large Hadron Collider’s ATLAS experiment, I got to participate in one of the most exciting experiments in physics today – which made it all the more heartbreaking to watch how much precious time vanished into struggling with unusable software, and how many opportunities for great ideas had to be abandoned while we wrestled with software problems that should have been helping us instead of holding us back.  If we could only capture some of the coding expertise that was out there, surely our grievously limited budgets and staff could reach far further, and do so much more.

Later, I had the great good fortune to be charged with building the user interface for TRIUMF’s upcoming GRIFFIN experiment, launching this month; thanks to Angelina, this was a watershed moment in realizing what research could do if it teamed up with the web. Angelina taught me about the incredibly rich thought the web community had in the spheres of usability, interaction design, and user experience; even my amateur first steps in this world allowed GRIFFIN to produce a powerful, elegant, web-based UI that was strides ahead of what we had before. But what really struck me was the incredible enthusiasm coders had for research. Angelina and I spoke about our plans for Interdisciplinary Programming on the JavaScript conference circuit in late 2013, and the response was overwhelming; coders were keen to contribute ideas, participate in the discussion and even get their hands dirty with contributions to the fields that excited them; and if I could push GRIFFIN ahead just by having a peek at what web developers were doing, what could we achieve if we welcomed professional coders to the realm of research in numbers? The moment is now to start studying what we can do together.

We’ll be posting projects in early July 2014, due to conclude no later than December 2014 (shorter projects also welcome); projects anticipated to fit this scope will be given priority.  In addition, the research teams should be prepared to answer a few short questions on how they feel the project is going every month or so.  Interested participants should send project details to the team at mills.wj@gmail.com by June 27, 2014.

 


Friday Odds and Ends

So things have been busy in non-blog land. Putting together a tenure packet, some travel, teaching, and oh yes, even science. So no insightful post here, just some odds and ends I collected, in no particular order:

  • There are quite a few species named after famous people: alive, dead, real or fictional. Wikipedia has a list. My favorites are a golden-bottomed horsefly named after Beyoncé; a beetle, A. schwarzeneggeri, named after “the actor, Arnold Schwarzenegger, in reference to the markedly developed (biceps-like) middle femora of the males of this species reminiscent of the actor’s physique” (paper); and a trilobite named after Han Solo.

  • Speaking of nomenclature, meet Boops boops.
  • The Pentagon has a contingency plan for the zombie apocalypse. I feel safer already.
  • Are the Steven Moffat episodes of Doctor Who more sexist than those written by Russell T Davies? A BYU student attempts to answer this question. Nice infographic. One could argue that any TV show with a powerful male alien constantly demonstrating his superiority to his female sidekick is somewhat inherently sexist, or at least can easily go that way regardless of script writer. But see also here.
Sexist?

  • This post articulates my sentiments on Why Python?

The genome of nerds

What makes a nerd a nerd? The stereotype is that of someone with a high intelligence, coupled with social awkwardness and a wardrobe that may alert the fashion police. Now scientists think they may have found the genomic links to these traits.

There was always a strong suspicion of a genetic component in people that are highly skilled in certain areas of engineering and sciences. Now we think that may be due to a particular type of viral infection.  We know that human endogenous retroviruses (HERVs) make up about 8% of the human genome (that’s more than our genes, really).  But what we don’t know is how they affect us, if at all. We think we do now. Specifically, a comprehensive study of human genomes from the 10,000 genome project has linked certain retroviral markers with education levels, certain vocations, and to a smaller extent, personal income. The result: programmers, engineers,  scientists (especially physicists, statisticians and mathematicians) all had specific HERV markers not found in the general populace. Some of these markers were located next to genes coding for proteins located in the frontal lobe: the brain area associated with problem-solving.


Nerd carriers?

 

But even more so, the overall number of HERV markers in those people was considerably smaller: sometimes less than 4%, almost half that of the general populace. Since HERV markers are generally associated with sexually transmitted viruses, this finding led the researchers to hypothesize that the early hominid ancestors of the “nerd” populace tended to mate less than the general populace, leading to fewer HERV markers but somehow to a more specific selection for the “brainy” traits. This would also explain the stereotypical “bright but shy” nerd.

Really interesting study, and you can read more about it here.

 

 


Data of thousands of ALS patients made available for analysis

This came up in my inbox. An interesting and welcome initiative, making thousands of ALS patients’ medical data available for analysis.

It doesn’t seem to have any sequence data (so not a bioinformatic database), but there are heaps of biomedical data in which to sink your statistical teeth.

Dear All,

My name is Hagit Alon and I am a scientific officer at Prize4Life Israel.

Prize4Life is a non-profit organization that is dedicated to accelerating treatments and a cure for ALS (also known as motor neuron disease or Lou Gehrig’s disease).

Prize4Life was founded by an ALS patient, Avichai Kremer, and is active in Israel and in the US.

Prize4Life developed a unique resource for bioinformatics researchers: The Pooled Resource Open-access ALS Clinical Trials (PRO-ACT) database.

This open-access database contains over 8500 records of ALS patients from past Phase II and Phase III clinical trials, spanning on average a year or more of data.

The data within PRO-ACT includes demographic data, clinical assessments, vital signs, lab (blood and urine) data, and also survival and medical history information. It is by far the largest ALS clinical trials database ever created, and is in fact one of the largest databases of clinical trial information currently available for any disease.

Data mining of the PRO-ACT is expected to lead to the identification of disease biomarkers, provide insight into the natural history of disease, as well as insights into the design and interpretation of clinical trials, each of which would bring us closer to finding a cure and treatment for ALS. The PRO-ACT database has been recently relaunched with more standardized and research ready data.

Now we finally have the data that may hold the key. The only thing missing is you. The next ALS breakthrough can be yours….

The data is available for research here

Thanks,

 

Hagit Alon | Scientific Officer


Top 5 in Bioinformatics

I recently applied for a Moore Foundation grant in Data Science for the biological sciences. As part of the pre-application, I was asked to choose the top 5 works in data science in my field. Not so sure about data science, so I picked what I think are the most influential works in Bioinformatics, which is what my proposal was about. Anyhow, the choice was tough, and I came up with the following. The order in which I list the works is chronological, as I make no attempt to rank them. If you ask me in the comments “How could you choose X over Y?” my reply would probably be: “I didn’t”.

Dayhoff , M.O.,  Eck RV, and  Eck CM. 1972. A model of evolutionary change in proteins. Pp. 89-99 in Atlas of protein sequence and structure, vol. 5, National Biomedical Research Foundation, Washington D.C

Summary: this is the introduction of the PAM matrix, the paper that set the stage for our understanding of molecular evolution at the protein level, sequence alignment, and the BLASTing we all do. The question she asked: how can we quantify the changes between protein sequences? How can we develop a system that tells us how proteins evolve over time? Dayhoff developed an elegant statistical method to do so, which she named PAM, for “Point Accepted Mutation”. She aligned hundreds of proteins and derived the frequencies with which the different amino acids substitute for each other. Dayhoff introduced a more robust version [PDF] in 1978, once enough protein sequences were available for her to count a large number of substitutions.
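To give a feel for the counting idea, here is a toy Python sketch that tallies substitutions in already-aligned sequence pairs and turns them into log-odds scores. It is a deliberate simplification and not Dayhoff's actual procedure, which worked from phylogenetic trees of closely related proteins and extrapolated to longer evolutionary distances. The function name and the tiny input are made up for illustration.

```python
import math
from collections import Counter


def substitution_log_odds(aligned_pairs):
    """Log-odds scores (base 2) for residue pairs seen in aligned sequences.

    aligned_pairs: iterable of (seq_a, seq_b) strings of equal length,
    with '-' marking gaps. Gapped columns are skipped.
    """
    pair_counts = Counter()
    residue_counts = Counter()
    for seq_a, seq_b in aligned_pairs:
        for a, b in zip(seq_a, seq_b):
            if a == "-" or b == "-":
                continue
            # Count each column in both orientations so scores are symmetric.
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
            residue_counts[a] += 1
            residue_counts[b] += 1

    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())
    scores = {}
    for (a, b), n in pair_counts.items():
        observed = n / total_pairs
        # Expected frequency if the two residues were paired by chance.
        expected = (residue_counts[a] / total_residues) * (
            residue_counts[b] / total_residues
        )
        scores[(a, b)] = math.log2(observed / expected)
    return scores


# Hypothetical four-column alignment, far too small for real use.
toy_scores = substitution_log_odds([("LLLA", "LLIA")])
```

A positive score means the pair turns up more often than chance would predict; with real data you would need many alignments before any of these numbers meant something.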

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.

BLAST, the Basic Local Alignment Search Tool, is the go-to computational workhorse in molecular biology. It is the most cited paper in life sciences, so probably the most influential paper in biology today. For the uninitiated: BLAST allows you to take a protein or DNA sequence and quickly search for similar sequences in a database containing millions. A search using one sequence takes seconds, or a few minutes at most. BLAST was actually introduced in another paper in 1990. However, the heuristics developed here allowed for the gapped alignment of sequences, and for searching for sequences which are less similar, with statistical robustness. BLAST changed everything in molecular biology, and moved biology into the data-rich sciences. If ever there was a case for giving the Nobel in Physiology or Medicine to a computational person, BLAST is it.

Durbin R., Eddy S., Krogh A. and Mitchison G. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.

The Moore Foundation solicitation asked for “works” rather than just “research papers”. If there is anything common to all bioinformatics labs, it’s this book. It is an overview of the basic sequence analysis methods. This book summarizes the pre-2000 foundation upon which almost all our knowledge is currently built: pairwise alignment, Markov models, multiple sequence alignment, profiles, PSSMs, and phylogenetics.

Gene Ontology: tool for the unification of biology. The Gene Ontology Consortium (2000) Nature Genetics 25: 25-29

Not a research paper, and not a book, but a “commentary”. This work popularized the use of ontologies in bioinformatics and cemented GO as the main ontology we use.

Pevzner PA, Tang H, Waterman MS. An Eulerian path approach to DNA fragment assembly. Proc Natl Acad Sci USA. 2001 Aug 14;98(17):9748-53.

Sequence assembly using de Bruijn graphs, making the assembly tractable for a large number of sequences. At the time, shotgun sequences produced by Sanger sequencing could still be assembled in reasonable time by solving for a Hamiltonian path. Once next-generation sequencing data started pouring in, the use of de Bruijn graphs and an Eulerian path became essential. For a great explanation of the methodological transition see this article in Nature Biotechnology.
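For readers who have not seen the trick, here is a minimal Python sketch of the idea: chop reads into k-mers, treat each k-mer as an edge between its (k-1)-mer prefix and suffix, and walk an Eulerian path through the result. This is a toy under strong assumptions (error-free reads, no repeats longer than k-1, each distinct k-mer used once) and is not the algorithm as implemented in the paper or in any real assembler. All function names and the example reads are invented for illustration.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the (k-1)-mers that follow it.

    Every distinct k-mer in the reads becomes one edge, from its prefix
    to its suffix. Real assemblers also track k-mer multiplicity.
    """
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes an Eulerian path exists."""
    in_degree = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_degree[target] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next(iter(graph))
    for node, targets in graph.items():
        if len(targets) - in_degree[node] == 1:
            start = node
            break
    remaining = {node: list(targets) for node, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path."""
    path = eulerian_path(de_bruijn_graph(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


# Three overlapping made-up reads from a ten-base sequence.
contig = assemble(["ATGGCGT", "GGCGTGC", "CGTGCA"], k=4)
```

The point of the construction is that finding an Eulerian path (visit every edge once) is cheap, whereas the overlap-graph formulation asks for a Hamiltonian path (visit every node once), which gets intractable as the number of reads grows.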

Yes, I know there are many deserving works not in here. When boiling down to five, the choice is almost arbitrary. If you feel offended that a work you like is not here, then I’m sorry.

 

 


Support Vector Machines explained well

 

Found this on Reddit r/machinelearning

(In related news, there’s a machine learning subreddit. Wow.)

Support Vector Machines (warning: dense Wikipedia article alert in the previous link!) are learning models used for classification: which individuals in a population belong where? So… how do SVMs and the mysterious “kernel” work?

The user curious_thoughts asked for an explanation of SVMs like s/he was a five year old. User copperking stepped up to the plate:

We have 2 colors of balls on the table that we want to separate.

svm1

We get a stick and put it on the table, this works pretty well right?

svm2

Some villain comes and places more balls on the table, it kind of works but one of the balls is on the wrong side and there is probably a better place to put the stick now.

svm3

SVMs try to put the stick in the best possible place by having as big a gap on either side of the stick as possible.

svm4

Now when the villain returns the stick is still in a pretty good spot.

svm5

There is another trick in the SVM toolbox that is even more important. Say the villain has seen how good you are with a stick so he gives you a new challenge.

svm6

There’s no stick in the world that will let you split those balls well, so what do you do? You flip the table of course! Throwing the balls into the air. Then, with your pro ninja skills, you grab a sheet of paper and slip it between the balls.

svm7

Now, looking at the balls from where the villain is standing, the balls will look split by some curvy line.

svm8

Boring adults call the balls data, the stick a classifier, the biggest gap trick optimization, flipping the table kernelling, and the piece of paper a hyperplane.
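The table-flipping step can be sketched in a few lines of plain Python. The example below is my own toy, not part of copperking's explanation: two rings of made-up points that no straight stick can separate, an explicit "flip" that adds a height z = x² + y², and a flat sheet of paper at height c placed midway between the two groups so the gap on either side is as big as possible. A real kernel SVM never builds the lifted coordinates explicitly, and it finds the sheet by solving an optimization problem rather than by taking a midpoint.

```python
import math


def ring(radius, n):
    """n evenly spaced points on a circle of the given radius."""
    return [
        (radius * math.cos(2 * math.pi * i / n),
         radius * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


def lift(point):
    """'Flip the table': add a height that depends on distance from the centre."""
    x, y = point
    return (x, y, x * x + y * y)


# Two colours of balls: one ring inside the other (hypothetical data).
inner = ring(1.0, 8)
outer = ring(3.0, 8)

inner_heights = [lift(p)[2] for p in inner]
outer_heights = [lift(p)[2] for p in outer]

# The sheet of paper: the horizontal plane z = c, midway between the groups.
c = (max(inner_heights) + min(outer_heights)) / 2


def classify(point):
    """Which side of the sheet does the lifted point land on?"""
    return "outer" if lift(point)[2] > c else "inner"
```

Seen from above (where the villain is standing), the sheet z = c shows up as the circle x² + y² = c, which is the "curvy line" from the story.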

 

 

That was copperking’s explanation.

Related: Udi Aharoni created a video visualizing a polynomial kernel:

 

And, more recently, William Noble published a paper in Nature Biotechnology. You can access an expanded version here. Thanks to Mark Gerstein for tweeting this paper.

 

Happy kernelling!

 


Carnival of Evolution, February 2014 Edition

Wow, I haven’t posted anything in quite a while. Things are busy outside blogoland. But committing this blog to the February edition of the Carnival of Evolution just made me do it, so here goes. We’ll do this by scales, bottom up.

Molecular

Prions are the infective agents that cause transmissible spongiform encephalopathies such as Mad Cow Disease in, well, cows, and Kuru or Creutzfeldt-Jakob disease in humans. Apparently prions are subject to natural selection (evolution) and, as the Lab Rat reports, no DNA is required.

Fibril model of prion propagation. Source: wikipedia

Back to genomes, can some genomes evolve more slowly than others? Larry Moran tackles this question in Sandwalk.

Microbial

The E. coli long-term evolution experiment is an ongoing study in experimental evolution led by Richard Lenski that has been tracking genetic changes in 12 initially identical populations of asexual Escherichia coli bacteria since 24 February 1988. What have we learned? A meta-post linking to other posts summarizes five important things you can learn by looking at over 50,000 generations of bacterial evolution. Larry Moran discusses the unpredictability of evolution and potentiation in Lenski’s long-term evolution experiment.

 


The 12 evolving E. coli populations on June 25, 2008 Source: Wikipedia

Animal

A new book is out, The Monkey’s Voyage by Alan de Queiroz, and it is reviewed by Richard Conniff. How Did Monkeys Cross the Atlantic? A Near-Miraculous Answer was posted at strange behaviors. Speaking of monkeys, or rather apes, a comparative examination of the chimp and human genomes reveals that 154 human genes have undergone positive selection compared with 233 chimp genes, after our phylogenetic split. Surprisingly, these are not the genes you may expect to have been selected as such.

From primates to canines, one dog has managed to outlive all others in its species… or its genes have. How? Read Carl Zimmer’s fascinating story on How A Dog Has Lived For Eleven Thousand Years posted at The Loom. In contrast, one species which is no longer with us is the Beelzebufo frog, also known as the Frog from Hell. Yes, this one ate dinosaurs, some 75 million years ago. Yikes.

As climate change continues to affect our world, species migrate and/or change phenotypes to adapt.  Or do they? Ben Haller recommends that you read Andrew Hendry’s post in Eco-Evo Evo-Eco to find out more.


Jump to 4:09 to see the Frog from Hell.

Mineral

How can you solve evolutionary problems with computers? A blog written by C. Titus Brown’s students explains evolutionary simulations and experiments in silico. Meanwhile, Bradly Alicea presents methods for Bet-hedging and Evolutionary Futures, posted at Synthetic Daisies. A re-examination of Hamilton’s rule tells us why altruism is not only not rare as an evolutionary trait, it should probably be expected and quite frequent. Bjorn Ostman reports in Pleiotropy about Sewall Wright’s last paper on adaptive landscapes.

 


Bet-hedging as an investment strategy. Use a rowboat and a hang-glider.

While Titus’s students and others have been evolving things in computers, John Wilkins tackles the question of whether life exists at all. No spoilers here, you will have to read it. You should probably also read Wilkins’s new book, on the Nature of Classification.

 

That’s it! Thank you for being with us, a short post for a short month. Don’t forget to submit to the March carnival!

 

 


Science funding on other planets

Got this from a tweet by Casey Bergman

 

ipN0ExT


BOSC 2014 Guess the Keynote Competition

(From Peter Cock, via the OBF News Blog)

 

We’re pleased to officially confirm that one of the two keynote speakers
for the 15th annual Bioinformatics Open Source Conference (BOSC 2014) will
be C. Titus Brown, as he announced on Twitter recently:

Titus Brown (@ctitusbrown):
Excited to be a keynote speaker at BOSC 2014! My title:
“A History of Bioinformatics (in the year 2039)”
– plenty of room for mischief ;)
https://twitter.com/ctitusbrown/status/410934403565490176

In recognition of the growing use of Twitter and social media within science
as a way of connecting across geographical divides, we’re announcing a
Twitter competition to guess who is scheduled to give the second keynote
at BOSC 2014 in Boston.

To enter, please tweet using hashtag #bosc2014 and include us via @OBF_BOSC,
e.g.

I think @OBF_BOSC should invite “Professor X” to be a keynote speaker
at #BOSC2014 because…

The first correct entry (within one week) will be awarded one complimentary
BOSC 2014 registration fee for themselves, or a nominated group member. This
does not cover travel or accommodation, and there is no cash substitute if you
cannot attend BOSC 2014. Members of the OBF board, BOSC organizing
committee, and ISMB SIG committee are not eligible, nor are the keynote
speakers themselves.

We intend to announce the mystery keynote speaker and any Twitter competition
winner in one week’s time, but reserve the right to cut short, modify, or
cancel the competition.

Our ulterior motive is to crowd source ideas for future keynote speakers in
BOSC 2015, so some serious suggestions please.

Further details about BOSC 2014 will be posted here:
http://www.open-bio.org/wiki/BOSC_2014

Thank you,

Peter Cock & Nomi Harris, BOSC 2014 co-chairs.

This was also posted to the OBF News Blog,
http://news.open-bio.org/news/2013/12/bosc-2014-keynote-competition/

BOSC and the OBF are on Twitter as:
https://twitter.com/OBF_BOSC
https://twitter.com/OBF_news


Lord of the papers

Three figures from the undergrad who is always high
Seven tables from the lab tech with his heart of stone
Nine supplements from the postdocs, with careers doomed to die
One manuscript for the Editor on his dark throne
In the journal submission form, where the shadows lie
One paper to rule them all, one paper to find them
One paper to bring them in and in the darkness bind them
In the submission form, where the shadows lie


PhD position in Statistical Protein Structure Prediction, Copenhagen, Denmark

One of the major unsolved problems in bioinformatics is the protein folding problem: given an amino acid sequence, predict the overall three-dimensional structure of the corresponding protein. It has been known since the seminal work of Christian B. Anfinsen in the early seventies that the sequence of a protein encodes its structure, but the exact details of the encoding still remain elusive. Since the protein folding problem is of enormous practical, theoretical and medical importance – and in addition forms a fascinating intellectual challenge – it is often called the holy grail of bioinformatics.

Currently, most protein structure prediction methods are based on rather ad hoc approaches. The aim of this project is to develop and implement a statistically rigorous method to predict the structure of proteins, building on various probabilistic models of protein structure developed by the Hamelryck group (see Bibliography). The method will also take the dynamic nature of proteins into account.

Bibliography:

Boomsma, W., Mardia, KV., Taylor, CC., Ferkinghoff-Borg, J., Krogh, A. and Hamelryck, T. (2008) A generative, probabilistic model of local protein structure. Proc. Natl. Acad. Sci. USA, 105, 8932-8937
Mardia, KV., Kent, JT., Zhang, Z., Taylor, C., Hamelryck, T. (2012) Mixtures of concentrated multivariate sine distributions with applications to bioinformatics. J. Appl. Stat. 39, 2475-2492.
Boomsma, W., Frellsen, J., Harder, T., Bottaro, S., Johansson, KE., Tian, P., Stovgaard, K., Andreetta, C., Olsson, S., Valentin, J., Antonov, L., Christensen, A., Borg, M., Jensen, J., Lindorff-Larsen, K., Ferkinghoff-Borg, J., Hamelryck, T. (2013) PHAISTOS: A framework for Markov chain Monte Carlo simulation and inference of protein structure. J. Comput. Chem. 34, 1697-705
Hamelryck, T., Mardia, KV., Ferkinghoff-Borg, J., Editors. (2012) Bayesian methods in structural bioinformatics. Book in the Springer series “Statistics for biology and health”, 385 pages, 13 chapters. Springer Verlag, March, 2012
Valentin, J., Andreetta, C., Boomsma, W., Bottaro, S., Ferkinghoff-Borg, J., Frellsen, J., Mardia, KV, Tian, P., Hamelryck, T. (2013) Formulation of probabilistic models of protein structure in atomic detail using the reference ratio method. Proteins. Accepted.

Requirements: Knowledge of statistics, machine learning and programming (C++ or equivalent). Knowledge of biology or biophysics is a plus but not a requirement.
Place of enrollment: Department of Biology, Bioinformatics Center
Supervisor: Assoc. Prof. Thomas Hamelryck
Co-supervisor: Prof. Michael Sørensen from Department of Mathematical Sciences

Apply here: http://dsin.ku.dk/positions/


Your Genome, Yourself?

We have Palaeolithic emotions, medieval institutions and God-like technologies.

E.O. Wilson

 

Whole genome sequencing will soon be cheap enough to be widely affordable. We are nearing the time when omics data may be retained for patients on a wide basis. These may include full exome, haplotype, full genome sequencing, tissue-level transcriptomic data, microbiome and metatranscriptome, metabolome… the sky, or rather the personal healthcare budget, is the limit.

Personal genomics advocates today present sequencing as a personal choice. When faced with concerns such as privacy, the general response from personalized medicine advocates is that the benefits outweigh privacy concerns, and in any case the people making the choice to sequence their genomes (at least now) are making an informed personal choice. This means that whatever possible detriment may ensue from sequencing the genome will only affect that person. So, for example, denial of life insurance due to genomic findings (legal in many countries) will only affect the person having their genome sequenced. In the US, where healthcare is privatized, denial of health insurance due to genetics is illegal, but the application of higher premiums is a concern. For example, some health insurance providers in the US charge higher premiums if the insured has two X chromosomes, although you usually don’t need full-genome sequencing to determine that genotype. Other privacy concerns may include the leaking (via legal or illegal means) of genomic information to various entities you may not want to have your DNA data.

Trouble is, getting your genome sequenced is not solely a personal choice: a person’s genomic information contains that of their family as well. So by having your own ‘omic information stored, you are making a choice for your siblings, parents, and children (including those yet unborn). You are making a choice for them to know, or at least suspect, that they have certain genotypes they may or may not wish to know about. Michael Snyder, one of the strongest advocates of personal genomics, has a habit of saying: “don’t sequence your genome if you are a worrier”. You may not be, but your unborn daughter may be. You may be able to correctly interpret the probabilistic data your genome provides, but your son or brother may not.

Or they may just be private people who would not want their genomic information out there, even by proxy. In realistic terms, by cross-referencing familial and genomic databases, your daughter may be denied certain health or life insurance coverage based on a genotype an insurance company does not like, which may simply reflect an over-interpretation of the limited predictive power of genomic data. By having your data accessible, some of her data are accessible as well, indirectly. No database is crack-proof, and re-identifying supposedly anonymous genomic data is surprisingly easy. Familial DNA matching, coupled with surreptitious collection of DNA, is becoming common practice in law enforcement for generating suspect lists. As the availability of genomic data increases, so does the erosion of personal privacy.

This all sounds rather alarmist and counter-progressive, and may give me the appearance of a bit of a Luddite, especially coming from a genome scientist… What about the huge benefits that await us from personal genomics? Should privacy and unfounded (or well-founded) anxieties stand in the way of progress? My prediction: they probably won’t. As the cost of personal genomics decreases, and the benefits (currently somewhat hyped) increase, genotyping may start to be mandated by healthcare providers, and perhaps even some employers. But revisit the motto of this post: should we not, at least, consider some of the implications of our choices upon others, if not ourselves, given our “Palaeolithic emotions and medieval institutions”?

 

http://blogs.cdc.gov/genomics/files/2011/08/woman_testtube2.jpg

Source: CDC (Public Domain) http://blogs.cdc.gov/genomics/2011/08/25/think-before-you-spit-do-personal-genomic-tests-improve-health/

 

 

 


Music Monday: I’ll see you in my dreams

Because... Django Reinhardt.

 


The Right to Read

Since this is Open Access Week, I thought I’d do the Open-Access / CC thing and share someone else’s work. In this case, a highly topical short story written by Richard Stallman. The author also has a constantly updated page with comments on the restrictions placed today on sharing reading materials. As you will see, this story may not be as far-fetched as it first seems…

The Right to Read

The following article appeared in the February 1997 issue of Communications of the ACM (Volume 40, Number 2).

Copyright © 1996, 2002, 2007, 2009, 2010 Richard Stallman

Reproduced under CC-BY-Noderiv license

From The Road To Tycho, a collection of articles about the antecedents of the Lunarian Revolution, published in Luna City in 2096.

For Dan Halbert, the road to Tycho began in college—when Lissa Lenz asked to borrow his computer. Hers had broken down, and unless she could borrow another, she would fail her midterm project. There was no one she dared ask, except Dan.

This put Dan in a dilemma. He had to help her—but if he lent her his computer, she might read his books. Aside from the fact that you could go to prison for many years for letting someone else read your books, the very idea shocked him at first. Like everyone, he had been taught since elementary school that sharing books was nasty and wrong—something that only pirates would do.

Continue reading The Right to Read →
