This ten-session experiential learning workshop is designed for students, staff, and postdocs across all disciplines who want to develop foundational and advanced competencies in data analysis using Python and Artificial Intelligence (AI) tools. In an era where data is pivotal to research and innovation, this series helps participants use Python, a versatile and widely adopted programming language, for effective data manipulation, insightful visualization, and robust statistical analysis (McKinney, 2023; VanderPlas, 2016). The curriculum progressively introduces core data science libraries such as Pandas, NumPy, Matplotlib, and Scikit-learn, building a solid understanding of the entire data analysis workflow.
Beyond traditional methods, the workshop explores the potential of AI, demystifying machine learning concepts and providing hands-on experience with predictive modeling. A distinctive aspect of this series is the integration of modern AI tools, including an introduction to using Large Language Models (LLMs) to augment analytical tasks such as data cleaning, insight generation, and code development.
The interdisciplinary nature of these skills is emphasized throughout, with examples and use cases drawn from diverse fields such as the natural and social sciences, engineering, humanities, and health sciences. Whether analyzing experimental results, textual corpora, survey data, or sensor outputs, participants will find the acquired skills directly applicable to their research. The workshop will also touch upon scientific outreach opportunities, enabling participants to better communicate their data-driven findings to broader audiences and contribute to open science initiatives. This practical, self-paced series aims to equip participants with the essential toolkit to confidently tackle complex data challenges and enhance their research impact.
- Python for Data Analysis, 3rd Edition. Wes McKinney (O’Reilly), 2023.
- Python Data Science Handbook. Jake VanderPlas (O’Reilly), 2016.
- Prompt Engineering. Tyson Swetnam, 2025.
Upon completion of this ten-session workshop series, participants will be able to:
- Achieve Foundational Proficiency in Python: Master Python programming fundamentals relevant to data acquisition, cleaning, manipulation, and transformation using core libraries like Pandas and NumPy.
- Develop Data Visualization and EDA Skills: Create meaningful data visualizations and perform comprehensive Exploratory Data Analysis (EDA) to uncover patterns, anomalies, and insights within datasets using libraries such as Matplotlib and Seaborn.
- Understand and Apply Core AI/ML Concepts: Grasp the fundamental principles of Artificial Intelligence and Machine Learning, and implement basic supervised and unsupervised learning models using Scikit-learn for predictive tasks.
- Integrate AI Tools for Enhanced Analysis: Learn to use emerging AI tools, including Large Language Models (LLMs), to assist and augment various stages of the data analysis workflow, from data preparation to insight generation and code assistance.
- Execute End-to-End Data Analysis Projects: Design and implement a complete data analysis project, demonstrating the ability to integrate Python scripting, data processing, machine learning, and AI-assisted techniques to address a defined problem and communicate results effectively.
RESOURCES AND NOTES:
- Register to join in person or via Zoom.
- When: Tuesdays @ 2 PM
- Where: Weaver Science and Engineering Library, Room 212.
- Zoom:(?)
(Schedule and content are subject to change.)
Instructor: Carlos Lizárraga
Date | Session Title | Description | Materials | Code | YouTube |
---|---|---|---|---|---|
08/26 | Session 1: Python Kickstart for Data Analysts 🐍 | Python fundamentals, setting up the environment, and basic syntax essential for data tasks. | |||
09/02 | Session 2: Data Wrangling with Pandas & NumPy 🐼 | Mastering data manipulation and cleaning using Python's core data science libraries. | |||
09/09 | Session 3: Visualizing Insights: Matplotlib & Seaborn 📊 | Creating impactful data visualizations to uncover patterns and communicate findings. | |||
09/16 | Session 4: Unveiling Stories: Exploratory Data Analysis (EDA) Techniques 🔍 | Applying statistical and visual techniques to explore datasets and generate hypotheses. | |||
09/23 | Session 5: AI & Machine Learning Demystified: Core Concepts 🤖 | Understanding fundamental AI and ML concepts, terminology, and the machine learning workflow. | |||
09/30 | Session 6: Predictive Power: Hands-on Machine Learning with Scikit-learn ⚙️ | Implementing basic supervised and unsupervised learning models using Scikit-learn. | |||
10/07 | Session 7: Understanding Text: Python for Natural Language Processing (NLP) Basics 📖 | Introduction to text data processing, feature extraction, and simple NLP tasks. | |||
10/14 | Session 8: AI Augmentation: Using LLMs for Smarter Data Analysis & Code Generation 💡 | Exploring how Large Language Models (LLMs) can assist in data cleaning, generating insights, and even writing Python code for analysis. | |||
10/21 | Session 9: Capstone Project: Building an End-to-End AI Data Analysis Pipeline 🧩 | Integrating skills from previous sessions to complete a mini-project, from data ingestion to insight generation with an AI component. | |||
10/28 | Session 10: The AI Horizon: Advanced Techniques, Ethics, and Future of Data Analysis 🚀 | Discussing advanced AI topics (e.g., deep learning basics, model interpretability), ethical considerations in AI for data analysis, and emerging trends. | |||
Notes in the Wiki.
Does your research call for data science tools, but you are unsure how to get started? Curious about the latest tools for organizing, visualizing, and understanding your dataset? Looking for a better theoretical understanding of key concepts in statistical analysis?
Join us for this beginner-friendly, concept-focused, and practical introduction to the theory and practice of data science, from start to finish! Sessions cover topics such as data wrangling, statistics, visualization, exploratory data analysis, time series analysis, machine learning, natural language processing, deep learning, prompt engineering, and AI tools. Enhance your capabilities and take your data science research to the next level!
RESOURCES AND NOTES:
- Data Science Essentials: From Jupyter to AI Tools
- Schedule and content are subject to change.
Date | Topic |
---|---|
01/16 | Introduction to Jupyter Notebooks |
01/23 | Data Wrangling 101: Pandas in Action |
01/30 | A Probability & Statistics refresher |
02/06 | A Probability & Statistics refresher |
02/13 | Data Visualization Libraries: Matplotlib |
02/20 | Data Visualization Libraries: Seaborn |
02/27 | Exploratory Data Analysis |
03/05 | Spring break |
03/12 | Time Series Analysis |
03/19 | Time Series Forecasting |
03/26 | Machine Learning with Scikit-Learn |
04/02 | Natural Language Processing |
04/09 | Deep Learning |
04/16 | Prompt Engineering |
04/23 | AI Tools Landscape |
Updated: 05/28/2025 (C. Lizárraga)
UArizona Data Lab, Data Science Institute, University of Arizona.