MATLAB Bioinformatics Toolbox User’s Guide
MathWorks
Getting Started
Bioinformatics Toolbox Product Description . 1-2
Key Features 1-2
Product Overview 1-3
Features 1-3
Expected Users 1-4
Data Formats and Databases 1-5
Sequence Alignments 1-7
Sequence Utilities and Statistics . 1-8
Protein Property Analysis . 1-9
Phylogenetic Analysis . 1-10
Microarray Data Analysis Tools . 1-11
Microarray Data Storage . 1-12
Mass Spectrometry Data Analysis . 1-13
Graph Theory Functions . 1-15
Statistical Learning and Visualization 1-16
Prototyping and Development Environment . 1-17
Data Visualization 1-18
Exchange Bioinformatics Data Between Excel and MATLAB 1-19
Using Excel and MATLAB Together . 1-19
About the Example . 1-19
Before Running the Example . 1-19
Running the Example for the Entire Data Set . 1-20
Editing Formulas to Run the Example on a Subset of the Data 1-21
Using the Spreadsheet Link product to Interact With the Data in MATLAB
Working with Whole Genome Data 1-25
Comparing Whole Genomes 1-32
High-Throughput Sequence Analysis
2
Work with Next-Generation Sequencing Data . 2-2
Overview . 2-2
What Files Can You Access? . 2-2
Before You Begin . 2-3
Create a BioIndexedFile Object to Access Your Source File 2-3
Determine the Number of Entries Indexed By a BioIndexedFile Object . 2-3
Retrieve Entries from Your Source File . 2-4
Read Entries from Your Source File . 2-4
Manage Sequence Read Data in Objects . 2-6
Overview . 2-6
Represent Sequence and Quality Data in a BioRead Object 2-7
Represent Sequence, Quality, and Alignment/Mapping Data in a BioMap Object 2-8
Retrieve Information from a BioRead or BioMap Object . 2-10
Set Information in a BioRead or BioMap Object . 2-12
Determine Coverage of a Reference Sequence 2-12
Construct Sequence Alignments to a Reference Sequence . 2-13
Filter Read Sequences Using SAM Flags . 2-14
Store and Manage Feature Annotations in Objects . 2-16
Represent Feature Annotations in a GFFAnnotation or GTFAnnotation Object 2-16
Construct an Annotation Object . 2-16
Retrieve General Information from an Annotation Object 2-16
Access Data in an Annotation Object 2-17
Use Feature Annotations with Sequence Read Data 2-18
Bioinformatics Toolbox Software Support Packages 2-21
Install Support Package . 2-21
Available Support Packages 2-21
Count Features from NGS Reads 2-23
Identifying Differentially Expressed Genes from RNA-Seq Data . 2-32
Visualize NGS Data Using Genomics Viewer App . 2-52
Open the App . 2-52
Add Tracks by Importing Data 2-52
Visualize Single Nucleotide Variation in Cytochrome P450 . 2-53
Exploring Genome-Wide Differences in DNA Methylation Profiles . 2-58
Exploring Protein-DNA Binding Sites from Paired-End ChIP-Seq Data 2-79
Working with Illumina/Solexa Next-Generation Sequencing Data . 2-97
Sequence Analysis
3
Exploring a Nucleotide Sequence Using Command Line 3-2
Overview of Example 3-2
Searching the Web for Sequence Information 3-2
Reading Sequence Information from the Web 3-4
Determining Nucleotide Composition 3-5
Determining Codon Composition . 3-8
Open Reading Frames 3-11
Amino Acid Conversion and Composition 3-13
Exploring a Nucleotide Sequence Using the Sequence Viewer App 3-15
Overview of the Sequence Viewer 3-15
Importing a Sequence into the Sequence Viewer 3-15
Viewing Nucleotide Sequence Information . 3-17
Searching for Words 3-19
Exploring Open Reading Frames . 3-22
Closing the Sequence Viewer . 3-25
Explore a Protein Sequence Using the Sequence Viewer App . 3-26
Overview of the Sequence Viewer 3-26
Viewing Amino Acid Sequence Statistics . 3-26
Closing the Sequence Viewer . 3-28
References . 3-29
Compare Sequences Using Sequence Alignment Algorithms . 3-30
View and Align Multiple Sequences 3-41
Overview of the Sequence Alignment App 3-41
Visualize Multiple Sequence Alignment 3-41
Adjust Sequence Alignments Manually 3-42
Rearrange Rows . 3-50
Generate Phylogenetic Tree from Aligned Sequences . 3-52
Analyzing Synonymous and Nonsynonymous Substitution Rates 3-55
Investigating the Bird Flu Virus . 3-65
Exploring Primer Design . 3-81
Identifying Over-Represented Regulatory Motifs . 3-91
Predicting and Visualizing the Secondary Structure of RNA Sequences 3-102
Using HMMs for Profile Analysis of a Protein Family 3-114
Predicting Protein Secondary Structure Using a Neural Network 3-131
Visualizing the Three-Dimensional Structure of a Molecule . 3-148
Calculating and Visualizing Sequence Statistics 3-163
Aligning Pairs of Sequences 3-177
Assessing the Significance of an Alignment 3-185
Using Scoring Matrices to Measure Evolutionary Distance 3-194
Calling Bioperl Functions from MATLAB 3-198
Accessing NCBI Entrez Databases with E-Utilities . 3-210
Microarray Analysis
4
Managing Gene Expression Data in Objects 4-2
Representing Expression Data Values in DataMatrix Objects 4-5
Overview of DataMatrix Objects 4-5
Constructing DataMatrix Objects . 4-5
Getting and Setting Properties of a DataMatrix Object . 4-6
Accessing Data in DataMatrix Objects . 4-6
Representing Expression Data Values in ExptData Objects 4-9
Overview of ExptData Objects 4-9
Constructing ExptData Objects . 4-9
Using Properties of an ExptData Object . 4-10
Using Methods of an ExptData Object . 4-10
References . 4-11
Representing Sample and Feature Metadata in MetaData Objects . 4-12
Overview of MetaData Objects 4-12
Constructing MetaData Objects . 4-13
Using Properties of a MetaData Object 4-15
Using Methods of a MetaData Object . 4-15
Representing Experiment Information in a MIAME Object . 4-16
Overview of MIAME Objects 4-16
Constructing MIAME Objects . 4-16
Using Properties of a MIAME Object . 4-17
Using Methods of a MIAME Object . 4-18
Representing All Data in an ExpressionSet Object 4-19
Overview of ExpressionSet Objects . 4-19
Constructing ExpressionSet Objects 4-20
Using Properties of an ExpressionSet Object . 4-21
Using Methods of an ExpressionSet Object . 4-21
Analyzing Illumina Bead Summary Gene Expression Data . 4-23
Detecting DNA Copy Number Alteration in Array-Based CGH Data 4-44
Analyzing Array-Based CGH Data Using Bayesian Hidden Markov Modeling 4-60
Visualizing Microarray Data 4-74
Gene Expression Profile Analysis 4-95
Working with Affymetrix Data . 4-112
Preprocessing Affymetrix Microarray Data at the Probe Level . 4-131
Analyzing Affymetrix SNP Arrays for DNA Copy Number Variants 4-142
Working with GEO Series Data . 4-162
Identifying Biomolecular Subgroups Using Attractor Metagenes 4-173
Working with the Clustergram Function . 4-185
Working with Objects for Microarray Experiment Data . 4-203
Phylogenetic Analysis
5
Using the Phylogenetic Tree App . 5-2
Overview of the Phylogenetic Tree App . 5-2
Opening the Phylogenetic Tree App . 5-2
File Menu . 5-3
Tools Menu . 5-11
Window Menu 5-17
Help Menu . 5-18
Building a Phylogenetic Tree for the Hominidae Species 5-19
Analyzing the Origin of the Human Immunodeficiency Virus . 5-25
Bootstrapping Phylogenetic Trees . 5-32
Mass Spectrometry and Bioanalytics
Preprocessing Raw Mass Spectrometry Data . 6-2
Visualizing and Preprocessing Hyphenated Mass Spectrometry Data Sets for Metabolite and Protein/Peptide Profiling 6-19
Identifying Significant Features and Classifying Protein Profiles 6-38
Differential Analysis of Complex Protein and Metabolite Mixtures using Liquid Chromatography/Mass Spectrometry (LC/MS) 6-52
Genetic Algorithm Search for Features in Mass Spectrometry Data 6-71
Batch Processing of Spectra Using Sequential and Parallel Computing