Web Scraping with Python Training Course
Web scraping is a method used to extract data from websites and store it in local files or databases.
This instructor-led training (delivered online or on-site) is designed for developers looking to automate the process of crawling multiple websites using Python, to gather data for further processing and analysis.
By the end of this course, participants will be able to:
- Set up Python along with all necessary packages.
- Extract and interpret data from various websites.
- Understand how websites work and how their HTML is structured.
- Create web crawlers capable of large-scale operations.
- Leverage Selenium for scraping AJAX-driven web pages.
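As an illustration of the "extract and interpret data" objective, the following is a minimal sketch and not course material. It assumes a hypothetical HTML snippet (`SAMPLE_HTML`), hypothetical class names (`book`, `price`), and hypothetical helper names (`BookParser`, `scrape_books`, `to_csv`). It uses only Python's standard library so that it runs without installing anything; the course itself names Scrapy and BeautifulSoup for this job.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this instead.
SAMPLE_HTML = """
<html><body>
  <ul id="books">
    <li class="book"><a href="/books/1">Dive Into Data</a> <span class="price">12.50</span></li>
    <li class="book"><a href="/books/2">Parsing by Hand</a> <span class="price">8.00</span></li>
  </ul>
</body></html>
"""


class BookParser(HTMLParser):
    """Collect the link target, title, and price of every <li class="book">."""

    def __init__(self):
        super().__init__()
        self.books = []        # finished records
        self._current = None   # record being filled, or None outside a book
        self._field = None     # field that the next text node belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "book":
            self._current = {"url": "", "title": "", "price": ""}
        elif self._current is not None and tag == "a":
            self._current["url"] = attrs.get("href", "")
            self._field = "title"
        elif self._current is not None and tag == "span" and attrs.get("class") == "price":
            self._field = "price"

    def handle_data(self, data):
        # Only keep text that sits inside a tracked <a> or price <span>.
        if self._current is not None and self._field:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None
        elif tag == "li" and self._current is not None:
            self.books.append(self._current)
            self._current = None


def scrape_books(html):
    """Parse the HTML text and return a list of book records."""
    parser = BookParser()
    parser.feed(html)
    return parser.books


def to_csv(rows):
    """Serialize the records as CSV text; write the string to a file to store it."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["url", "title", "price"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


books = scrape_books(SAMPLE_HTML)
print(to_csv(books))
```

The parser tracks which element it is inside, which is the same idea that selector-based libraries hide behind CSS or XPath queries.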
Course Format
- Engaging lectures and discussions.
- Extensive exercises and practice sessions.
- Practical implementation in a live-lab setting.
Customization Options for the Course
- This course presumes prior programming knowledge.
- To request tailored training, please contact us to make arrangements.
Course Outline
Introduction
Setting up the Development Environment
Python Primer: Data Structures, Conditionals, File Handling, etc.
Python Packages for Web Scraping: Scrapy and BeautifulSoup
How a Website Works
How HTML is Structured
Making a Web Request
Scraping an HTML Page
Working with XPath and CSS
Filtering Data Using Regular Expressions
Creating a Web Crawler
Crawling AJAX and JavaScript Pages with Selenium
Web Scraping Best Practices
Troubleshooting
Summary and Conclusion
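To make the crawler and regular-expression topics in the outline above more concrete, here is a minimal sketch under stated assumptions, not the course's own implementation. The site content (`FAKE_SITE`), the e-mail pattern (`EMAIL_RE`), and the names `LinkParser` and `crawl` are all hypothetical. Pages are served from an in-memory dictionary so the example runs offline; a real crawler would pass in a `fetch` function that performs HTTP requests, and would typically also honor robots.txt and rate limits.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical in-memory "website" standing in for real HTTP responses.
FAKE_SITE = {
    "https://example.com/": '<a href="/about">About</a> <a href="/contact">Contact</a>',
    "https://example.com/about": (
        '<p>Founded 2012.</p> <a href="/">Home</a> '
        '<a href="https://other.example/x">Partner</a>'
    ),
    "https://example.com/contact": "<p>Write to info@example.com or sales@example.com</p>",
}

# Deliberately simple e-mail pattern, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl restricted to the start URL's host.

    `fetch` is any callable mapping a URL to HTML text (or None on failure).
    Returns a dict of URL -> sorted list of e-mail addresses found on that page.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}  # URLs already queued, so no page is visited twice
    results = {}

    while queue and len(results) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        results[url] = sorted(set(EMAIL_RE.findall(html)))

        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return results


found = crawl("https://example.com/", FAKE_SITE.get)
for page, emails in found.items():
    print(page, emails)
```

Passing `fetch` as a parameter keeps the traversal logic separate from networking, so the same loop can be exercised offline and later pointed at a real HTTP client.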
Requirements
- Programming experience, preferably in Python. If participants have programming experience in a language other than Python, the training can be extended to include more introductory Python exercises.
Audience
- Developers
Testimonials (1)
Many different examples and topics were covered, from basic investigation to login management and dynamic page management.
Daniele Tagliaferro - Creditsafe Italia Srl
Course - Web Scraping with Python
Related Courses
Scaling Data Analysis with Python and Dask
14 Hours
This instructor-led, live training in the UAE (online or onsite) is aimed at data scientists and software engineers who wish to use Dask with the Python ecosystem to build, scale, and analyze large datasets.
By the end of this training, participants will be able to:
- Set up the environment to start building big data processing with Dask and Python.
- Explore the features, libraries, tools, and APIs available in Dask.
- Understand how Dask accelerates parallel computing in Python.
- Learn how to scale the Python ecosystem (NumPy, SciPy, and Pandas) using Dask.
- Optimize the Dask environment to maintain high performance in handling large datasets.
Data Analysis with Python, Pandas and Numpy
14 Hours
This instructor-led, live training in the UAE (online or onsite) is aimed at intermediate-level Python developers and data analysts who wish to enhance their skills in data analysis and manipulation using Pandas and NumPy.
By the end of this training, participants will be able to:
- Set up a development environment that includes Python, Pandas, and NumPy.
- Create a data analysis application using Pandas and NumPy.
- Perform advanced data wrangling, sorting, and filtering operations.
- Conduct aggregate operations and analyze time series data.
- Visualize data using Matplotlib and other visualization libraries.
- Debug and optimize their data analysis code.
FARM (FastAPI, React, and MongoDB) Full Stack Development
14 Hours
This instructor-led, live training (online or onsite) is aimed at developers who wish to use the FARM (FastAPI, React, and MongoDB) stack to build dynamic, high-performance, and scalable web applications.
By the end of this training, participants will be able to:
- Set up the necessary development environment that integrates FastAPI, React, and MongoDB.
- Understand the key concepts, features, and benefits of the FARM stack.
- Learn how to build REST APIs with FastAPI.
- Learn how to design interactive applications with React.
- Develop, test, and deploy applications (front end and back end) using the FARM stack.
Developing APIs with Python and FastAPI
14 Hours
This instructor-led, live training in the UAE (online or onsite) is aimed at developers who wish to use FastAPI with Python to build, test, and deploy RESTful APIs more easily and quickly.
By the end of this training, participants will be able to:
- Set up the necessary development environment to develop APIs with Python and FastAPI.
- Create APIs more quickly and easily using the FastAPI library.
- Learn how to create data models and schemas based on Pydantic and OpenAPI.
- Connect APIs to a database using SQLAlchemy.
- Implement security and authentication in APIs using the FastAPI tools.
- Build container images and deploy web APIs to a cloud server.
Machine Learning with Python – 2 Days
14 Hours
This course aims to equip participants with foundational skills in applying Machine Learning techniques to real-world scenarios. Using the Python programming language and its diverse libraries, alongside numerous practical examples, learners will be guided on how to effectively use key components of Machine Learning, make informed data modeling choices, interpret algorithm outputs, and verify the accuracy of results.
Our objective is to empower you with the confidence to comprehend and apply essential Machine Learning tools while steering clear of typical missteps in Data Science implementations.
Machine Learning with Python – 4 Days
28 Hours
This course aims to enhance your practical proficiency in utilizing Machine Learning techniques. By leveraging the Python programming language and its extensive libraries, along with numerous real-world examples, you will learn to effectively use key components of Machine Learning, make informed data modeling choices, interpret algorithm outputs, and verify results.
Our objective is to equip you with the confidence to understand and apply essential Machine Learning tools while steering clear of typical Data Science challenges.
Accelerating Python Pandas Workflows with Modin
14 Hours
This instructor-led, live training in the UAE (online or onsite) is aimed at data scientists and developers who wish to use Modin to build and implement parallel computations with Pandas for faster data analysis.
By the end of this training, participants will be able to:
- Set up the necessary environment to start developing Pandas workflows at scale with Modin.
- Understand the features, architecture, and advantages of Modin.
- Know the differences between Modin, Dask, and Ray.
- Perform Pandas operations faster with Modin.
- Implement the entire Pandas API and functions.
Python for Natural Language Generation (NLG)
21 Hours
In this instructor-led, live training in the UAE, participants will learn how to use Python to produce high-quality natural language text by building their own NLG system from scratch. Case studies will also be examined and the relevant concepts will be applied to live lab projects for generating content.
By the end of this training, participants will be able to:
- Use NLG to automatically generate content for various industries, from journalism to real estate to weather and sports reporting.
- Select and organize source content, plan sentences, and prepare a system for automatic generation of original content.
- Understand the NLG pipeline and apply the right techniques at each stage.
- Understand the architecture of a Natural Language Generation (NLG) system.
- Implement the most suitable algorithms and models for analysis and ordering.
- Pull data from publicly available data sources as well as curated databases to use as material for generated text.
- Replace manual and laborious writing processes with computer-generated, automated content creation.
Advanced Machine Learning with Python
21 Hours
In this instructor-led, live training in the UAE, participants will learn the most relevant and cutting-edge machine learning techniques in Python as they build a series of demo applications involving image, music, text, and financial data.
By the end of this training, participants will be able to:
- Implement machine learning algorithms and techniques for solving complex problems.
- Apply deep learning and semi-supervised learning to applications involving image, music, text, and financial data.
- Push Python algorithms to their maximum potential.
- Use libraries and packages such as NumPy and Theano.
Python: Automate the Boring Stuff
14 Hours
This instructor-led, live training in the UAE is based on the popular book, "Automate the Boring Stuff with Python", by Al Sweigart. It is aimed at beginners and covers essential Python programming concepts through practical, hands-on exercises and discussions. The focus is on learning to write code to dramatically increase office productivity.
By the end of this training, participants will know how to program in Python and apply this new skill for:
- Automating tasks by writing simple Python programs.
- Writing programs that can do text pattern recognition with "regular expressions".
- Programmatically generating and updating Excel spreadsheets.
- Parsing PDFs and Word documents.
- Crawling websites and pulling information from online sources.
- Writing programs that send out email notifications.
- Using Python's debugging tools to quickly resolve bugs.
- Programmatically controlling the mouse and keyboard to click and type for you.
Python Programming for Finance
35 Hours
Python is a widely used programming language in the financial sector, adopted by major investment banks and hedge funds for developing various financial applications, from core trading systems to risk management tools.
In this instructor-led live training session, participants will learn how to leverage Python to create practical solutions for specific finance-related challenges.
By the end of this course, participants will be able to:
- Grasp the basics of the Python programming language.
- Download, install, and maintain the optimal development tools for building financial applications in Python.
- Select and apply appropriate Python packages and techniques to manage, visualize, and analyze financial data from diverse sources (CSV, Excel, databases, web, etc.).
- Create applications that address issues related to asset allocation, risk analysis, investment performance, and more.
- Debug, integrate, deploy, and optimize a Python application.
Audience
- Developers
- Analysts
- Quants
Course Format
- The course includes lectures, discussions, exercises, and extensive hands-on practice.
Note
- This training is designed to address key challenges faced by finance professionals. If you have a specific topic, tool, or technique that you would like to cover in more detail, please contact us to arrange for additional content.
Advanced Python - 4 Days
28 Hours
This instructor-led, live training in the UAE (online or onsite) is aimed at developers who wish to learn advanced Python programming techniques, including how to apply this versatile language to solve problems in areas such as distributed applications, data analysis and visualization, UI programming and maintenance scripting.
Python Programming - 4 days
28 Hours
This course is tailored for individuals eager to master the Python programming language. It focuses on the Python language itself, its core libraries, and a selection of the most beneficial libraries developed by the Python community. Python powers businesses globally and is widely used by scientists. It stands among the most favored programming languages.
The course utilizes the latest version of Python 3.x, with practical exercises that leverage its full capabilities. It can be conducted on any operating system, including various UNIX flavors such as Linux and Mac OS X, as well as Microsoft Windows.
Practical exercises make up approximately 70% of the course time, while demonstrations and presentations account for around 30%. Participants are encouraged to ask questions and engage in discussions throughout the course.
Note: The training can be customized to meet specific requirements if requested before the scheduled course date.
Test Automation with Selenium and Python
14 Hours
Selenium is an open-source library designed for automating web application testing across various browsers. It interacts with the browser in the same way people do: by clicking links, filling out forms, and validating text. Selenium stands as the most widely used tool for web application test automation and is built on the WebDriver framework, offering robust bindings for multiple scripting languages, including Python.
During this instructor-led training, participants will leverage the capabilities of Python alongside Selenium to automate the testing of a sample web application. Through a combination of theory and practical exercises in a live lab setting, participants will acquire the skills necessary to automate their own web testing projects using Python and Selenium.
Course Format
- Interactive lecture and discussion.
- Numerous exercises and practice sessions.
- Hands-on implementation within a live-lab environment.
Customization Options for the Course
- To request a customized training session, please contact us to make arrangements.
Text Summarization with Python
14 Hours
In Python machine learning, text summarization reads input text and generates a summary. This functionality is accessible from the command line or as a Python API/library. One notable application is the quick creation of executive summaries; this is especially beneficial for organizations that need to analyze large volumes of textual data before preparing reports and presentations.
During this instructor-led, live training session, participants will learn how to use Python to develop a simple application that automatically generates a summary from input text.
By the end of this training, participants will be able to:
- Utilize a command-line tool for summarizing text.
- Create and design Text Summarization code using Python libraries.
- Evaluate three Python summarization libraries: sumy 0.7.0, pysummarization 1.0.4, readless 1.0.17.
Audience
- Developers
- Data Scientists
Course Format
- The course includes lectures, discussions, exercises, and extensive hands-on practice.