Part A — Python programming and scientific libraries
- Introduction to the Python language:
- The formal syntax and semantics of the language
- The software platform, including user interfaces (the interpreter) and the Python standard library
- Basic datatypes: numbers (Booleans, integers, floating-point numbers) and text (strings)
- Collections: lists, tuples, and dictionaries (maps)
- Compound statements (conditional and iterative code), functions, and modules
- Introduction to the external libraries that form the backbone of the data analysis pipeline in Python:
- The Pandas table processing library, for reading, writing, and analyzing tabular data, with native support for statistical computing and summarization
- The NumPy numerical library, with support for many common operations, such as linear algebra and random number generation
- The Matplotlib library, for producing richly annotated plots
- (For QCB): The Biopython library, for loading and manipulating a variety of standard biological data formats, including sequence data and structural protein annotations
- (For DataScience): Additional libraries such as Beautiful Soup, etc.
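
As a rough illustration of the core-language topics listed in Part A (basic datatypes, collections, conditional and iterative statements, functions), a minimal sketch could look like the following. The variable names, the sample values, and the `count_bases` function are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
# Basic datatypes: Booleans, integers, floating-point numbers, and strings.
# All values below are made-up sample data.
is_coding = True        # bool
length = 1200           # int
gc_fraction = 0.41      # float
name = "geneA"          # str

# Collections: a list (mutable), a tuple (immutable), and a dictionary (map).
reads = [12, 7, 30]
position = (3, 1500)
counts = {"A": 2, "C": 1}


def count_bases(sequence):
    """Return a dictionary mapping each character of `sequence` to its count."""
    result = {}
    for base in sequence:          # iterative statement
        if base in result:         # conditional statement
            result[base] += 1
        else:
            result[base] = 1
    return result


base_counts = count_bases("GATTACA")
print(base_counts)
```

The function uses a plain dictionary on purpose, to show the collection types named in the outline; the standard library's `collections.Counter` would be a shorter alternative.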
Part B — Algorithms and Data Structures
- Introduction to algorithms and algorithm analysis
- Algorithmic complexity
- Data structures
- High level overview
- Sequences, maps (ordered/unordered), sets
- Data structure implementations in Python
- Trees
- Definition of the tree data structure
- Traversals (visits)
- Graphs
- Definition of the graph data structure
- Traversals (visits)
- Algorithms on graphs
- Algorithmic techniques
- Divide and conquer (divide et impera)
- Dynamic programming
- Greedy algorithms
- Backtracking
- Brief introduction to the NP complexity class
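
To give a flavour of two Part B topics, graph visits and dynamic programming, here is a small sketch under stated assumptions. The adjacency-dictionary representation, the example `graph`, and the function names `bfs` and `lcs_length` are illustrative choices, not the definitions used in the course. The first function is a breadth-first visit; the second computes the length of a longest common subsequence with a bottom-up table.

```python
from collections import deque

# Hypothetical directed graph stored as a dictionary: node -> list of neighbours.
graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}


def bfs(graph, start):
    """Return the nodes reachable from `start` in breadth-first order."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


def lcs_length(s, t):
    """Length of a longest common subsequence of s and t (dynamic programming)."""
    # table[i][j] holds the answer for the prefixes s[:i] and t[:j].
    table = [[0] * (len(t) + 1) for _ in range(len(s) + 1)]
    for i in range(1, len(s) + 1):
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(s)][len(t)]


print(bfs(graph, "a"))
print(lcs_length("AGGTAB", "GXTXAYB"))
```

With V nodes and E edges, the visit takes O(V + E) time because each node is enqueued at most once; the table-based function takes O(len(s) * len(t)) time and space, which ties back to the complexity analysis listed earlier in Part B.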