Lab 6: Lab Exam Trial
Next week (Monday, March 2nd) we’ll be doing the lab component of the midterm, worth 5-10% of your final grade (whichever gives you a higher overall score, with the written midterm at 15-20%). To test out the setup in a low-stakes environment, this lab will be conducted in class with the following websites whitelisted:
The usual lab computer environments will be available - VS Code, Jupyter Notebooks, Spyder, etc. There is currently an issue with Python’s IntelliSense extension on VS Code (Pylance), but hopefully this will be resolved before the real thing. In the meantime, enjoy the challenge of writing code without autocomplete.
The task
If you missed the exercise or just want to take another look, you can now download the OKCupid dataset and starter notebook.
For both this trial run and the real thing, I will provide you with a starter notebook and CSV, distributed through the “I” drive (hopefully it is ready in time) directly to your Desktop. It will be a familiar dataset.
You will be asked to:
- Load some data from a CSV
- Do a little bit of data exploration
- Split into train/test
- Implement some components of a preprocessing pipeline, such as feature encoding or transformations
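As a rough warm-up, the four steps above could look something like the sketch below. This is not the exam notebook: it assumes pandas and scikit-learn are the tools in use, and the column names (`age`, `height`, `sex`, `target`) and the tiny inline table are made up for illustration. In the lab you would read the provided CSV file instead.

```python
import io

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in for the provided CSV. In the lab, pass the file path
# to pd.read_csv instead of this in-memory text.
CSV_TEXT = """age,height,sex,target
22,75,m,1
35,70,m,0
28,64,f,1
41,62,f,0
19,68,m,1
33,66,f,0
26,72,m,1
30,65,f,0
"""

# 1. Load some data from a CSV
df = pd.read_csv(io.StringIO(CSV_TEXT))

# 2. Do a little bit of data exploration
print(df.shape)         # number of rows and columns
print(df.dtypes)        # which columns are numeric vs. text
print(df.isna().sum())  # missing values per column

# 3. Split into train/test (fixed random_state so the split is repeatable)
X = df.drop(columns=["target"])
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 4. Preprocessing: scale the numeric columns, one-hot encode the categorical one
preprocessor = ColumnTransformer(
    [
        ("num", StandardScaler(), ["age", "height"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["sex"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

# Fit on the training set only, then apply the same transformation to the test set
X_train_t = preprocessor.fit_transform(X_train)
X_test_t = preprocessor.transform(X_test)
print(X_train_t.shape, X_test_t.shape)
```

Note that the preprocessor is fitted on the training split only and then reused on the test split, so no information from the test rows leaks into the scaling or encoding.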
You may bring a handwritten cheat sheet to this trial run and add to it while you work through the problem, then bring it to the “real thing” on March 6th.
Submission
We’re still working out some issues with the “I” drive, but hopefully there will be a submission folder where you can copy and paste your notebook file. Please name your notebook after yourself (e.g. mine would be charlotte.ipynb). You will not be able to see other submissions, nor update your own, but you can add a new one (e.g. charlotte_v2.ipynb) and I’ll only look at the most recent submission.
Turns out I can yank your files directly from your desktop using NetSupport.
The grade for the trial run will be pass/fail (as usual), while I will be evaluating the “real thing” based on your choices.