Magellan - Model COMPASS - Analysis of Large Biomolecular Datasets Software
We’re Changing What’s Possible in Mass Spec Research. Magellan designs software for analysis of large biomolecular datasets, particularly those derived from mass spectrometry. With technological leaps in mass-spec instrumentation, Magellan’s software provides the computational tools to rapidly analyze massive mass-spec data flows. Magellan’s analytics can consider the entire data output of the most advanced mass spec instruments. It allows comparative proteomics, lipidomics, and/or metabolomics experiments with thousands of patients. It will facilitate seamless integration of mass-spec datasets with sequencing-based datasets. What took months or years of data analysis can now be accomplished in days, even with limited prior mass spec experience.
Next-Gen Bioanalytics
Magellan’s AI software is solving the big-data problem in mass spec, allowing the discovery of previously hidden biology.
Rapid, large-scale analysis of mass spec data makes different experimental approaches possible and increases the chances for novel discovery. Identify every mass spec feature connected to a biological question first, then characterize its molecular identity. Analyze datasets large enough to detect relevant differences connected to the most subtle biology. Let Magellan’s COMPASS software transform your research...
Join our community and receive occasional updates as we progress toward launch!
Feature-first mass spec sacrifices molecular identity information for larger data scale and dataset reproducibility. It repeatedly analyzes the maximum number of species and their abundance, sometimes termed the MS1 analysis. MS1 data quality is increased and more individual measurements can be made as complex samples are passed through the mass spec.
In contrast to feature-first mass spec, the conventional approach sacrifices data scale for detailed molecular identity information in the reported data. A single MS1 analysis is subjected to one of several algorithms, by which mass spec software picks a small subset of the detected molecules for secondary analysis. In this secondary, MS2 analysis, the individual species are fragmented and measured to reconstruct the molecular identity of the original molecule.
While MS2 data provides molecular identity, the selection of molecules for MS2 fragmentation can reduce data scale by orders of magnitude. Besides a reduction in data scale, the selection of molecules for MS2 analysis is not reproducible; the same sample can be analyzed twice and yield abundance information for different species each time. The result is significant gaps in the resulting dataset that make data analysis and cross-sample comparison difficult.
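The reproducibility gap described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the function `select_for_ms2`, the noise level, and the "top 50 by intensity" rule are hypothetical stand-ins, not the selection logic of any real instrument software.

```python
import random

def select_for_ms2(intensities, top_n, noise, rng):
    """Pick top_n features ranked by noisy intensity.

    A toy stand-in (assumption) for software that selects a subset of
    detected molecules for MS2 fragmentation.
    """
    noisy = {
        name: value * rng.uniform(1 - noise, 1 + noise)
        for name, value in intensities.items()
    }
    ranked = sorted(noisy, key=noisy.get, reverse=True)
    return set(ranked[:top_n])

rng = random.Random(0)

# One sample with 1,000 detectable features (simulated intensities).
intensities = {f"feature_{k}": rng.lognormvariate(0, 1) for k in range(1000)}

# The same sample analyzed twice: measurement noise changes the ranking.
run_a = select_for_ms2(intensities, top_n=50, noise=0.3, rng=rng)
run_b = select_for_ms2(intensities, top_n=50, noise=0.3, rng=rng)

shared = run_a & run_b
only_one_run = run_a ^ run_b
print(f"{len(shared)} features selected in both runs")
print(f"{len(only_one_run)} features selected in only one run (gaps when comparing)")
```

Features that appear in only one run are the missing values that make cross-sample comparison difficult, while 950 of the 1,000 simulated features are never selected at all.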
In contrast, feature-first mass spec considers entire mass spec datasets. Once biological connections of species are established, MS2 analysis can be targeted to molecules of definite biological interest and reveal their identity.
The limitation of feature-first mass spec has been availability of tools that work at very large data scales. High resolution mass spec instruments are capable of reporting abundance information for a million molecular species or more.
The obstacle to feature-first mass spec is not one of instrumentation: it is a data science problem. Comparing expression patterns runs afoul of the n² problem: computers must consider as many values as the square of the number of data points. A million mass spec data points means considering a trillion values.
The n² problem is exacerbated when the comparison must be made across large numbers of samples. If expression differences are to be assessed across 1,000 samples, the number of computed values increases accordingly. For 1 million data points per sample, a quadrillion values must be computed. This data scale exceeds the computational power of any supercomputer. Feature-first mass spec suffers from a data overload problem.
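The arithmetic behind these figures can be checked in a few lines. This is a minimal sketch: the function names are illustrative, and the storage estimate assumes 8 bytes per value (a 64-bit float), which the text does not specify.

```python
def pairwise_values(n_features: int) -> int:
    """Values in a full n x n comparison of feature profiles."""
    return n_features ** 2

def total_values(n_features: int, n_samples: int) -> int:
    """Values to compute when the n x n comparison is repeated per sample."""
    return pairwise_values(n_features) * n_samples

features = 1_000_000  # data points per sample, as in the text
samples = 1_000       # samples in the experiment, as in the text

per_comparison = pairwise_values(features)
across_samples = total_values(features, samples)

print(f"one comparison: {per_comparison:,} values")   # a trillion
print(f"all samples:    {across_samples:,} values")   # a quadrillion

# Assumption: each value stored as an 8-byte float.
terabytes = per_comparison * 8 / 1e12
print(f"storing one full comparison: {terabytes:,.0f} TB")
```

Doubling the number of data points quadruples the work, which is why the problem grows faster than instrument resolution does.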
The result is that feature-first mass spec is attractive in theory, but difficult to realize in practice. Feature-first mass spec studies are often limited in scope, considering differences detected in relatively low numbers of samples. Even high impact publications in top journals seldom involve more than a few hundred samples.
Magellan Bioanalytics’ software is purpose-built to handle feature-first mass spec datasets that contain a million or more individual data points and allows comparison of data profiles across thousands of samples or more. With a user-friendly interface, results from large and complex experiments can be extracted and visualized in hours. These tools make feature-first proteomics accessible to any life sciences research team and dramatically enhance the productivity of research groups dedicated to mass spec techniques.
Analyzing large biomolecular datasets is the future of biology. But integrating the data from different types of biomolecular datasets, particularly mass spec data with DNA sequencing-based data, is a challenge.
Magellan Bioanalytics can convert DNA sequencing-based information (FastQ files) into a data format that makes integrated analysis with mass spec and other quantitative data not only possible, but easy.
Currently, researchers perform mass spec and sequencing analysis separately using different tools and then compare the results. In some instances, results from analysis of one biomolecular dataset constrain the analysis of other biomolecular datasets, introducing bias and reducing the amount of data analyzed.
Our tools allow for sequencing-based and mass spec datasets from the same samples to be analyzed as one. Biological connections between metabolomic, lipidomic, proteomic, and genomic datapoints can be rapidly identified.
The era of combinatorial analysis of large biomolecular datasets is here.
Today’s mass specs produce far more data than researchers can use. Experimental design is affected by this paradigm; massive amounts of data are discarded to create smaller, biased datasets that can be managed with limited computational tools.
Smaller, more manageable datasets result when researchers treat their samples in ways that select which molecules enter the mass spec, or when they allow mass spec software to select a subset of molecules for identification. Selection of molecules to be analyzed is independent of the experimental question and many molecules of potential interest are ignored.
DNA sequencing once suffered from this limitation. Limitations on sequencing capacity forced researchers to ask whether specific sequences were connected to biological processes. The arrival of next-generation sequencing technology allowed for genome-wide association studies (GWAS), where entire genomes could be analyzed to find small sequences linked to biological effects.
GWAS studies consider the biological question before the data is queried, eliminating biases and allowing unexpected discovery. Magellan Bioanalytics allows this same change in approach with mass spec data.
Biomarkers are molecules whose expression reflects biology. In an ideal world, every biological parameter could be captured by a single biomarker. We do not live in that world.
The connections between biomarkers and biological processes and systems are complex, and so almost all biomarkers imperfectly reflect biology. The more subtle the biological difference, the less information content its associated individual biomarkers carry.
The answer is multivariate analysis. Information content from multiple biomarkers can be combined to increase confidence in the underlying biology they reflect. It is critical that combined biomarkers each contribute distinct information content about the system, as combining the same information content from multiple biomarkers does not increase total information.
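A small simulation can make the point about distinct information concrete. This is a sketch under simplifying assumptions, not a description of how COMPASS works: each simulated biomarker is a shared class effect plus its own independent Gaussian noise, "combining" means a simple average, and separation is measured as the difference in group means divided by the pooled standard deviation.

```python
import random
import statistics

def separation(group0, group1):
    """Difference of group means in units of pooled standard deviation."""
    pooled_var = (statistics.pvariance(group0) + statistics.pvariance(group1)) / 2
    return abs(statistics.fmean(group1) - statistics.fmean(group0)) / pooled_var ** 0.5

def marker(effect, labels, rng):
    """Simulated biomarker: class effect plus independent Gaussian noise."""
    return [effect * y + rng.gauss(0, 1) for y in labels]

def average(m1, m2):
    """Combine two markers by averaging them sample by sample."""
    return [(a + b) / 2 for a, b in zip(m1, m2)]

def split(values, labels):
    """Split measurements into the two biological groups."""
    group0 = [v for v, y in zip(values, labels) if y == 0]
    group1 = [v for v, y in zip(values, labels) if y == 1]
    return group0, group1

rng = random.Random(1)
labels = [0] * 2000 + [1] * 2000   # two conditions, 2,000 samples each
a = marker(0.5, labels, rng)       # weak individual biomarker
b = marker(0.5, labels, rng)       # second marker with its own noise

sep_single = separation(*split(a, labels))
sep_redundant = separation(*split(average(a, a), labels))  # same information twice
sep_distinct = separation(*split(average(a, b), labels))   # distinct information

print(f"single marker:        {sep_single:.2f}")
print(f"same marker combined: {sep_redundant:.2f}")
print(f"distinct markers:     {sep_distinct:.2f}")
```

Averaging a marker with a copy of itself leaves the separation unchanged, while averaging two markers with independent noise improves it, because the noise partly cancels and the shared signal does not.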
Underlying these dynamics is biomarker scarcity. Biomarkers with high information content are rare. The ability to find biomarkers is directly correlated with the number of candidates considered. Larger candidate pools (greater data scale) will turn up more relevant biomarkers with unique information content.
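The link between candidate pool size and discovery follows from basic probability. The sketch below assumes that each candidate is informative independently with the same small probability `p`; the value of `p` is purely illustrative and does not come from the text.

```python
def expected_hits(n_candidates: int, p_informative: float) -> float:
    """Expected number of informative biomarkers among the candidates."""
    return n_candidates * p_informative

def prob_at_least_one(n_candidates: int, p_informative: float) -> float:
    """Chance that the pool contains at least one informative biomarker."""
    return 1 - (1 - p_informative) ** n_candidates

p = 1e-5  # assumed rarity of a high-information biomarker (illustrative only)

for n in (1_000, 100_000, 1_000_000):
    hits = expected_hits(n, p)
    chance = prob_at_least_one(n, p)
    print(f"{n:>9,} candidates: expect {hits:.2f} hits, "
          f"P(at least one) = {chance:.3f}")
```

Under these assumptions a pool of a thousand candidates will most likely contain no informative biomarker at all, while a pool of a million is expected to contain several.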
This is why data scale matters. Data scale constrains the ability to find enough high information content biomarkers to distinguish between sample sets with different biology. Systems that rely on lower numbers of data points are fundamentally limited in their power.
Magellan’s software allows for the unbiased and rapid analysis of mass spec experiments that generate millions of data points per sample. This is the largest data scale in the industry, meaning it is poised to distinguish molecular details underpinning the most subtle biological differences.
Magellan’s COMPASS software opens new horizons of discovery for those who are…
Inexperienced with mass-spec experiments
Are you interested in probing your biological samples with mass spec-based approaches to develop key biological insights, but have been stymied by a lack of experience with mass-spec instrumentation?
COMPASS software makes analysis of mass-spec datasets rapid and user friendly, even for those who have no mass spec instrument time. With COMPASS software, you can send your samples for routine LC-MS analysis, upload the resulting data onto the COMPASS server, and extract the information hidden in all of that data, all without a doctorate in biochemistry.
Confident with mass-spec, but drowning in too much data
Have you been using mass-spec for years, diligently producing massive numbers of mass-spec output files, only to become overwhelmed with the analysis of large experiments? Or perhaps you are a mass-spec expert who performs experiments producing large numbers of files, only to find you spend years of analysis time at a computer screen for every month at the bench?
COMPASS software is designed to rapidly generate insights from comparisons of results from thousands, even tens of thousands, of individual mass-spec files. Use COMPASS to make data analysis on your mass spec research quick and easy. And get back to doing experiments.
Frustrated by the limitations of existing computational tools
Are you frustrated that your mass spec experiments fail to yield clear results? Or are you disappointed knowing that you are sacrificing data for molecular identities? Perhaps you struggle knowing that analysis of a limited number of well-characterized molecular species is unlikely to generate novel insights?
COMPASS software is designed for feature-first analysis of mass-spec data, analyzing millions of potential datapoints to identify those relevant to your biological question. Analysis of more data, of all of the data, enhances the opportunity for discovery and maximizes signals associated with important biology.
- Discover Novel Biology
- Stratify Trial Participants
- Develop Companion Dx
- Identify Drug Targets
Drowning in data
COMPASS software can analyze and make sense of large mzML files...
Not finding answers in MS2
COMPASS software can identify what data is important for differentiating conditions or biology...