Course Outline
I. Introduction and preliminaries
1. Overview
- Making R more friendly, R and available GUIs
- RStudio
- Related software and documentation
- R and statistics
- Using R interactively
- An introductory session
- Getting help with functions and features
- R commands, case sensitivity, etc.
- Recall and correction of previous commands
- Executing commands from or diverting output to a file
- Data permanency and removing objects
- Good programming practice: self-contained scripts and good readability (e.g. structured scripts, documentation, markdown)
- Installing packages; CRAN and Bioconductor
2. Reading data
- Txt files (read.delim)
- CSV files
3. Simple manipulations; numbers and vectors + arrays
- Vectors and assignment
- Vector arithmetic
- Generating regular sequences
- Logical vectors
- Missing values
- Character vectors
- Index vectors; selecting and modifying subsets of a data set
- Arrays
- Array indexing. Subsections of an array
- Index matrices
- The array() function + simple operations on arrays e.g. multiplication, transposition
- Other types of objects
4. Lists and data frames
- Lists
- Constructing and modifying lists
- Concatenating lists
- Data frames
- Making data frames
- Working with data frames
- Attaching arbitrary lists
- Managing the search path
5. Data manipulation
- Selecting, subsetting observations and variables
- Filtering, grouping
- Recoding, transformations
- Aggregation, combining data sets
- Forming partitioned matrices, cbind() and rbind()
- The concatenation function, c(), with arrays
- Character manipulation, stringr package
- Short intro to grep and regexpr
6. More on Reading data
- XLS, XLSX files
- readr and readxl packages
- Data in SPSS, SAS, Stata and other formats
- Exporting data to txt, csv and other formats
7. Grouping, loops and conditional execution
- Grouped expressions
- Control statements
- Conditional execution: if statements
- Repetitive execution: for loops, repeat and while
- Intro to apply, lapply, sapply, tapply
8. Functions
- Creating functions
- Optional arguments and default values
- Variable number of arguments
- Scope and its consequences
9. Simple graphics in R
- Creating a Graph
- Density Plots
- Dot Plots
- Bar Plots
- Line Charts
- Pie Charts
- Boxplots
- Scatter Plots
- Combining Plots
II. Statistical analysis in R
1. Probability distributions
- R as a set of statistical tables
- Examining the distribution of a set of data
2. Testing of Hypotheses
- Tests about a Population Mean
- Likelihood Ratio Test
- One- and two-sample tests
- Chi-Square Goodness-of-Fit Test
- Kolmogorov-Smirnov One-Sample Statistic
- Wilcoxon Signed-Rank Test
- Two-Sample Test
- Wilcoxon Rank Sum Test
- Mann-Whitney Test
- Kolmogorov-Smirnov Test
3. Multiple Testing of Hypotheses
- Type I Error and FDR
- ROC curves and AUC
- Multiple Testing Procedures (BH, Bonferroni etc.)
4. Linear regression models
- Generic functions for extracting model information
- Updating fitted models
- Generalized linear models
- Families
- The glm() function
- Classification
- Logistic Regression
- Linear Discriminant Analysis
- Unsupervised learning
- Principal Components Analysis
- Clustering Methods (k-means, hierarchical clustering, k-medoids)
5. Survival analysis (survival package)
- Survival objects in R
- Kaplan-Meier estimate, log-rank test, parametric regression
- Confidence bands
- Censored (interval censored) data analysis
- Cox PH models, constant covariates
- Cox PH models, time-dependent covariates
- Simulation: Model comparison (Comparing regression models)
6. Analysis of Variance
- One-Way ANOVA
- Two-Way Classification of ANOVA
- MANOVA
III. Worked problems in bioinformatics
- Short introduction to limma package
- Microarray data analysis workflow
- Data download from GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1397
- Data processing (QC, normalisation, differential expression)
- Volcano plot
- Clustering examples + heatmaps
Testimonials
Modeling and how to fit the data to model
- USDA
I enjoyed the self-learning through exercises and the tips and shortcuts shared.
- Competition Bureau
We had many varying levels of skill in the class which created the need for more thorough explanations at times to ensure understanding. Pace and structure was generally pleasant.
Gary Munn - Vodacom
The trainer truly showed how powerful R is and why it is beneficial.
Vodacom
Hands on examples were the most helpful.
Sean Kaukas
Working 1:1 with Gunnar.
Bryant Ives
He is patient.
Abdul De kock - Vodacom
Overview and understanding how big the topic is
British American Shared Services Europe BAT GBS Finance, WER/Centre/EEMEA
His knowledge and practical examples
Irina Tulgara
A lot of knowledge - theoretical and practical
Anna Alechno
Graphs in R :)))
Faculty of Economics and Business Zagreb
We gained some knowledge about NN in general, and what was the most interesting for me were the new types of NN that are popular nowadays.
Tea Poklepovic
new insights in deep machine learning
Josip Arneric
very tailored to needs
Yashan Wang
The subject matter and the pace were perfect.
Tim - Ottawa Research and Development Center, Science Technology Branch, Agriculture and Agri-Food Canada
The tutor, Mr. Michael Yan, interacted with the audience very well, and the instruction was clear. The tutor also went to the extent of adding more information based on requests from the students during the training.
Ottawa Research and Development Center, Science Technology Branch, Agriculture and Agri-Food Canada
the introduction of new packages
Ottawa Research and Development Center, Science Technology Branch, Agriculture and Agri-Food Canada
Michael the trainer is very knowledgeable and skillful about the subject of Big Data and R. He is very flexible and quickly customizes the training to meet clients' needs. He is also very capable of solving technical and subject matter problems on the go. Fantastic and professional training!
Xiaoyuan Geng - Ottawa Research and Development Center, Science Technology Branch, Agriculture and Agri-Food Canada
Tomasz was engaging and always ready to answer our questions. This course wouldn't have worked without him.
Kelly Gale, Global Knowledge Network Training Ltd
The R-programming overview training is quite intensive but Tomasz is always helpful, energetic and up to date. On top of it, he is passionate about R. I would highly recommend his R sessions to anyone interested in R.
Luiza Panoschi - Kelly Gale, Global Knowledge Network Training Ltd
Very knowledgeable instructor and I was given the tools and confidence to start exploring R on my own.
Canada Revenue Agency
I really enjoyed the first day on navigating RStudio and some of the shortcuts he provided throughout the other days.
Canada Revenue Agency
I really love the hands-on approach that Michael designed. It was very effective to have short PowerPoint lessons that explained concepts, and then be able to apply them in the virtual environment in R. I was also very excited to see how powerful a tool R can be, and I am looking forward to learning more in the future.
Environment and Climate Change Canada
Practice exercises were relevant and very helpful to reinforce the knowledge.
Andy Kwan - Environment and Climate Change Canada
Follow-along exercises after slide presentation kept engagement.
Robin White - Environment and Climate Change Canada
Michael was very knowledgeable and clear in his instruction of the training. Course was well structured to teach the desired subject as well as the right amount of room was left to adjust to fit our needs better. Over all, I am very happy with the course.
Brock Batey - Environment and Climate Change Canada
The effort made by the trainer to engage the audience, as well as time taken to prepare exercises.
Hollie Coe, Wild Bioscience
The exercises to test what I had learnt. It felt like problem solving and I felt a sense of pride when I could solve it.
Hollie Coe, Wild Bioscience
Lots of relevant examples and R markdown files, which I hope will be useful to refer to later.
Hollie Coe, Wild Bioscience
The tool was interesting and I see the use. I would like to learn more about it.
- Teleperformance
new tool which is "R" and I find it interesting to know the existence of such tool for data analysis
Michael Lopez - Teleperformance
Exercises on time series modeling
Teleperformance
I got answers to all my questions.
Natalia Gladii
It was very informative and professionally held. Wojtek's knowledge level was so advanced that he could basically answer any question and he was willing to put effort into fitting the training to my personal needs.
Sonja Steiner - BearingPoint GmbH
The trainer was so knowledgeable and included areas I was interested in
Mohamed Salama
Practical exercises with R were very helpful.
CEED Bulgaria
The exercises.
Elena Velkova - CEED Bulgaria
He was very informative and helpful.
Pratheep Ravy
The many practical examples / assignments that we went through were great. For me, I learn better by seeing examples and applying them elsewhere. The use of real data and applying what was taught against it was extremely valuable. Michael's PowerPoint presentations and his ability to work through each solution were invaluable.
- Trimac Management Services LP
Good detail on what R is used for and how to start using it right away
Hoss Shenassa - Trimac Management Services LP
The remote classroom setting worked very well
- Trimac Management Services LP
the matter was well presented and in an orderly manner.
Marylin Houle - Ivanhoe Cambridge
The flexible and friendly style. Learning exactly what was useful and relevant for me
Jenny Tickner
real life practical examples
Wioleta (Vicky) Celinska-Drozd
The patience of Kamil.
Laszlo Maros
First session. Very intensive and quick.
Digital Jersey
Kamil is a very knowledgeable and nice person; I have learned a lot from him.
Aleksandra Szubert
Detailed and comprehensive instruction given by experienced and clearly knowledgeable expert on the subject.
Justin Roche
The way the trainer made complex subjects easy to understand.
Adam Drewry
Learning how to use Excel properly