
IndyPy: The Risks of Code-Writing ML Models
The August 2022 edition of IndyPy featured an analysis of the risks involved in the emerging field of code-writing machine learning models. IndyPy, Indiana’s largest Python meetup, was founded in 2007 by Six Feet Up CTO and Amazon Web Services (AWS) Community Hero Calvin Hendryx-Parker.
In his presentation, Nick Doiron, a software engineer at Determined AI, discusses Large Language Models (LLMs), which process large amounts of text and data to build a model of a given language. Such models have led to products like GitHub Copilot, which GitHub claims suggests 40% of the code in projects where it’s used.
While code-writing ML models can help programmers solve problems, Doiron says there are some factors to keep in mind. His presentation covers:
- some of the pitfalls of text-based LLMs, such as deepfake text and biases;
- previously reported issues with code-writing models;
- how LLMs can be trained to understand a given language;
- tasks that LLMs are typically trained for; and
- broader questions about code-writing ML, such as whether it could constitute plagiarism.
Watch the Presentation
Did you miss the presentation? Watch the recording and explore tidbits via @IndyPy’s live Twitter thread.
Links and Resources
You can find Doiron on:
- LinkedIn: https://www.linkedin.com/in/nickdoiron/
- Twitter: https://twitter.com/mapmeld
- GitHub: https://github.com/mapmeld
Tools to get started with ML models:
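The presentation’s full tool list isn’t reproduced here, but a common entry point is Hugging Face’s transformers library. The sketch below is illustrative only: the model choice (Salesforce/codegen-350M-mono, a small open code-generation model) and the prompt are assumptions for demonstration, not tools Doiron specifically endorsed.

```python
# Minimal sketch: prompting a small open code-generation model to
# complete a Python function. Requires `pip install transformers torch`.
from transformers import pipeline

# Model choice is an assumption for illustration; any causal
# code LLM on the Hugging Face Hub could be swapped in.
generator = pipeline("text-generation", model="Salesforce/codegen-350M-mono")

prompt = 'def fibonacci(n):\n    """Return the nth Fibonacci number."""\n'
completions = generator(prompt, max_new_tokens=64, num_return_sequences=1)

# The pipeline returns a list of dicts; "generated_text" holds the
# prompt plus the model's suggested continuation.
print(completions[0]["generated_text"])
```

As Doiron’s talk cautions, suggestions like these should be reviewed before use; generated code can carry the biases and bugs of its training data.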