Andrew Friedman
afriedman412 [at] gmail [dot] com
WORK
Data Scientist
1-1-2019 to present
Sludge
Build, deploy and maintain data pipelines and web applications to supply journalists with the most recent data on topics such as congressional investments and PAC expenditures.
A few of the stories that used data from the app:
- The Members of Congress Who Profit From War
- Members of Congress Own Up to $93 Million in Fossil Fuel Stocks
- Revealed: how US senators invest in firms they are supposed to regulate
- Reps Questioning Megabank CEOs Own Stock in Their Companies
Data Scientist (Contract)
1-1-2023 to 6-1-2023
Center for Just Journalism
In partnership with the NYU Wagner School of Public Service, I worked with graduate students to investigate American newspapers' reliance on police sources when reporting on crime, and how that reliance affected coverage of both crime and police. While the students conducted an in-depth analysis of a representative 300-article sample, I used a programmatic approach to analyze the full 100,000-article data set.
KEY RESPONSIBILITIES:
- Development of a standalone Python package for quote identification, attribution and resolution
- Acquisition and processing of 100,000 articles
- Lexis/Nexis query optimization to minimize irrelevant or off-topic articles
- Topic modeling to verify the fidelity of the students' sub-sample
Data Lead
6-1-2022 to 6-2-2023
Google/Medill Data-Driven Reporting Project
Member of a team awarded a grant from the Google/Medill Data-Driven Reporting Project to study 30 years of detailed crime statistics obtained from the Baltimore Police Department.
Results of the study are being published as a multipart series in The Real News:
- Part 1: Baltimore's Crime Numbers Game
- Part 2: The Short History and Long Tail of Baltimore’s “Zero Tolerance” Policing
- Part 3: An Audit of Baltimore City's Data Integrity
- Part 4: An Evaluation of City Budget and Health Metrics
Data Scientist
11-1-2018 to 8-1-2022
Chatdesk
Chatdesk is a Series A company in the customer service space, backed by leading Silicon Valley investors such as Menlo Ventures, Susa Ventures and Slow Ventures. Its customers include leading brands such as Grubhub, BarkBox, Thinx, and OLAPLEX.
KEY RESPONSIBILITIES:
- Deployed a message classification model handling 1 million weekly messages with 99.5% accuracy
- Implemented Named Entity Recognition to increase flexibility of cleaning code
- Developed and maintained code base for cleaning and standardization of incoming messages for downstream processing for 100+ companies in 10+ languages, from diverse sources (Zendesk, Salesforce, Intercom, Facebook, Instagram)
- Attended client meetings on technical integration and conducted data analysis to support the sales team
Instructional Associate
9-1-2018 to 5-1-2020
General Assembly
Trained students in data science methods, concepts and technologies, including Bash, Python, data mining, supervised and unsupervised learning techniques, model building, forecasting, SQL, AWS and NLP.