✨Keep it simple, make it scale. AI should be about empowering people, building understanding, & making dreams realities. 👩💻GenAI @GoogleDeepMind ex-@GitHub
✨👩💻 Our @DeepMind Code AI team delivered a presentation this morning about the work we've done internally and externally, and the path for reinventing what it means to do software development and creative technical work in the age of generative models.
🤖 RL and generative…
mom: I'm riding in an Uber today
mom: and the driving directions are terrible
me: sorry, mom, that's too bad
mom: the driver says it is because of the algorithm
mom: but I told him my daughter works on algorithms for her job and would fix it for him
me: ...
mom: here is Mark
*standing in line for pho tonight, behind a dad who's trying to explain Alzheimer's to his son*
boy: "But grandma forgot me!"
dad: "Remember Dory? She forgot things, too, right?"
boy: "…yes"
dad: "But did that mean she loved Nemo any less?"
boy: *beaming* "no!"
🤯 Mind officially blown:
I recorded a screen capture of a task (looking for an apartment on Zillow). Gemini was able to generate Selenium code to replicate that task, and described everything I did step-by-step.
It even caught that my threshold was set to $3K, even though I…
O(1): random access to an element in a collection, dependent on indexing
O(n): list iterations
O(n^2): nested loops on the same collection
O(log n): divide and conquer
O(n log n): iterations that use divide and conquer
O(n!): generating every ordering (permutation) of your inputs; loosely, adding a nested loop for every input you have
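A minimal Python sketch of one routine per growth rate in the list above. The function names are made up for illustration (they are not from any library), and each routine is just one common instance of its complexity class, not the only one.

```python
from itertools import permutations


def constant_lookup(items, i):
    # O(1): indexing into a list does not depend on its length
    return items[i]


def linear_sum(items):
    # O(n): one pass over the collection
    total = 0
    for x in items:
        total += x
    return total


def count_pairs(items):
    # O(n^2): nested loops over the same collection
    count = 0
    for _ in items:
        for _ in items:
            count += 1
    return count


def binary_search(sorted_items, target):
    # O(log n): halve the remaining search range on every step
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1  # not found


def merge_sort(items):
    # O(n log n): log n levels of splitting, with O(n) merge work per level
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def all_orderings(items):
    # O(n!): there are n! permutations of n distinct items
    return list(permutations(items))
```

Doubling the input barely moves `binary_search`, doubles the work for `linear_sum`, quadruples it for `count_pairs`, and makes `all_orderings` impractical almost immediately.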
The team at @Deezer just released #Spleeter, a Python music source separation library with state-of-the-art pre-trained models! 🎶✨
Straight from the command line, you can extract voice, piano, drums... from any music track! Uses @TensorFlow and #Keras.
🔫 Badass! A team at the @MistralAI hackathon in SF trained the 7B open-source model to play DOOM, based on an ASCII representation of the current frame in the game. 🤯 @ID_AA_Carmack
👩💻 Big news: today is my first day as the Director for Data Science & MLOps at @github.
Am excited to work with everyone in the open-source community to ensure that our platforms, developer tools, & ML systems offer a productive, delightful, collaborative, and safe experience. 🤗
"Our studies revealed that when solving a coding task with Copilot, programmers may spend a large fraction of total session time (34.3%) on just double-checking and editing suggestions, and spend *more than half* of the task time on Copilot related activities, together indicating…
There have been several good studies on how people actually use LLMs for coding tasks. This one presents an interesting, mixed-methods deep dive into the topic. From @HsseinMzannar, @bansalg_, @adamfourney, and @erichorvitz; good list of relevant cites, too!
I'm absolutely in love with the mathematical and code annotations for this blog post on reinforcement learning! 😍✨
Building a Powerful DQN in @TensorFlow 2.0 (explanation & tutorial) by Sebastian Theiler
Want to get started with open-source software, but not with code?
👩🎨 Design a logo
⁉️ Answer questions on @StackOverflow
📒 Write docs!
👩🏫 Teach a course
📈 Triage issues, or help with project management
❤️ Post on social media
🐛 Submit issues
*All* contributions are valuable.
📊📒🤯 What in the world... @ProjectJupyter notebooks in @MSExcel:
"Use #Excel as an interactive playground for organizing and visualizing your data, seamlessly switching to Python for more sophisticated tools, such as machine learning or data munging."
Two new websites that I'll be keeping handy for the foreseeable future:
🙌 : translates natural language inputs into bash commands using @TensorFlow 2.x
🐚 : beautiful, detailed manpage explanations for arbitrary shell commands
I don't want to be That Person, @Netflix, but the font that you have listed as an option for a chat program in 1994 wasn't released to the general public until 2007
✨🧠 The ecosystem that has grown up around @TensorFlow in the last few years blows my mind. There's just so much functionality, compared to some of the other, newer frameworks.
👉Consider this an ever-expanding thread for me to take notes + wrap my brain around products. Ready?
It is an unfortunate reality that academia does not consider software engineering or tool-building valid areas of academic work—but if libraries like ggplot2 and scikit-learn were referenced as often as they were used in research papers, that paradigm might shift.
Please cite.
🚀I am over the moon to announce that I will be joining @Microsoft's Developer Tools division as a Principal PM, starting next Tuesday.
This means getting to partner with teams like @code, @Github CodeSpaces, #IntelliCode, & more: bringing a data-driven approach to development.
@hardmaru I wish they had also created a diverse dataset of rugs so that it didn’t confuse black stripes with cliffs and I could finally get my entire house cleaned 😂
friendly reminder that (a) high-res pictures of Han Solo exist on the internet; and (b) companies will put any high-res photo on a shower curtain and send it to you for $22
👩💻 Full disclosure: I detest whiteboard interviews.
But: @Stanford offers a course each year called "problem-solving for the CS Technical Interview", and if you want to pass whiteboard exams at FAANG companies, it is *the best* resource.
Link here: 👉
📢 IMPORTANT LIFE EVENT: ✨
Am delighted to announce that I've started a 100%, full-time rotation as product manager for @TensorFlow, focusing on #S4TF + high-level APIs (particularly #Keras).
Please send me your feedback, and let's work together to make @TensorFlow even better!
Am very, very ready for the deep learning hype curve to start dying down. 😅
*So much* business value can be claimed with basic statistics, exploratory data analysis, and traditional machine learning–and those techniques require just a fraction of the time and complexity of DL.
"Now it is clear that anyone working with rocket fuels is outstandingly mad. I don’t mean garden-variety crazy or a merely raving lunatic. I mean a record-shattering exponent of far-out insanity." 🚀
Oh, rad: did you know that you can use functions as values in a Python dictionary? 🐍
You can store a lambda function as the value for a key in the dictionary, then look it up by that key, pass arguments to it, and evaluate it.
😘 Unsolicited {dict} pic:
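A minimal sketch of the idea. The names `operations` and `calculate` are assumptions made up for illustration, not taken from the original picture:

```python
# Hypothetical dispatch table: each key maps to a function stored as the value.
operations = {
    "add": lambda a, b: a + b,
    "subtract": lambda a, b: a - b,
    "multiply": lambda a, b: a * b,
    "divide": lambda a, b: a / b,
}


def calculate(name, a, b):
    # Look the function up by its key, then call it with the arguments.
    return operations[name](a, b)


result = calculate("multiply", 6, 7)
print(result)
```

This pattern can stand in for a long if/elif chain: adding a new operation means adding one dictionary entry.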
Favorite explanatory sentences:
—"Let's make this concrete with an example."
—"Let's unpack that description / definition."
—"Now, we're going to focus on building intuition."
Least favorite:
—"This exercise is left to the reader."
—"As you can see…"
—"As should be obvious…"
and one day I will tell my grandchildren how their grandmother unmounted, then remounted, a television in her hotel room just to connect an HDMI cable to her @Raspberry_Pi
d'you remember the telephone game we played as kids?
everyone stood in a line - 1st person whispers a sentence to the 2nd person, who makes a minor change, etc.
by the end of the line, the sentence was mangled/unrecognizable
the grown-up version of this is called "Legacy Code"
I love machine learning, for a lot of reasons; but one of those reasons is *definitely* that a person can think:
"What would happen if I tried to bucketize the lyrics from 124,288 heavy metal songs into categories?"
and publish an academic paper on it.
I want to not be surprised if a software engineer or a data scientist used to be a biologist, or a bartender, or an aesthetician, or a geologist.
These skills don't require an ivory tower, or time sacrificed on acronyms for an advanced degree. GitHub or GitLab should be your CV.
me: mom
me: i keep telling you
me: i cannot move to austin
mom: how many AI people can there be in california?
mom: just move them all here
me: ..
mom: y'all could all live together
mom: out in the hill country
me: ..
me: you want me to establish a texan deep learning commune
📊 "Computer programming textbooks and software documentation often contain flowcharts to illustrate the flow of an algorithm or procedure.
Modern OCR engines often tag these flowcharts as graphics and ignore them in further processing. In this paper, we work towards making…
🤯Wait... this can't be right. Right?🤯
Just tested creating a SQLite database on , using solely Pyodide; and ran a query against it with pandas.
All in the browser.
No VM, no cloud-hosted resources; just @Github and the WASM-y Python data science stack.
"Reading computer code *does not* activate the regions of the brain that are involved in language processing.
It activates a distributed network, which is also recruited for complex cognitive tasks such as solving math problems or crossword puzzles." 🧮🧩
✨ @Microsoft's New Future of Work Report just dropped, and it's overwhelmingly focused on how AI impacts every aspect of information work. Adding some highlights below, but strongly recommend taking a look through the slides!
📓 TL;DR - information…
👩💻 What is a "data scientist" or "machine learning engineer", really?
📄:
Synthesizing responses from @StackOverflow, the @PSF Survey, the @Kaggle Survey, the @AnacondaInc survey, and more, I have taken a first stab at some common cohorts. Take a look!
✨BIG IMPORTANT NEWS✨
I will be transitioning to a new role at @Microsoft, working as a SWE with @HaishiBai2010 and Yaron Schneider under the legendary @MarkRussinovich. 😁
Being an @AzureAdvocates has been the experience of a lifetime, and I'm excited for the new challenge! 🙌
So, time to drop some knowledge bombs. Most data scientists aren't taught:
- TCP/IP Protocol architectures
- how to deploy a server
- RESTful vs SOAP web services
- Linux command line tools
- the software development life cycle
- modular functions + the concept of writing tests
"This extension adds a quick command to search @StackOverflow without leaving @Code.
You can find the command by search, or by using the hotkeys cmd+h on Mac or ctrl+h on @Windows."
who built this & how can I buy you coffee
👩💻
⚙️
Am loving the simplicity of these static timeline plots, generated with Python from just a blob of JSON:
Do you have any favorite libraries for similar plots (with the caveat that I'd like to have something just as simple)?
Mine was to work on machine learning at @Google.
It's taken a while—and, on the way, I found a different job (and a family) that also turned out to be a dream. ✨
But: 11/27 will be my first day as a @TensorFlow Developer Advocate at @GoogleAI, working in Mountain View, CA. ❤️
Rant (again):
I'm a firm believer that if your students don't feel safe, supported, and confident enough to ask questions, then you aren't doing your job as a teacher. 🙅♀️
Education should be about building up the people around you; not elevating yourself or your ego. 👎
Y'all've probably already guessed, but I'm moving back to Central Texas, and have a couple of non-work goals (more RE: work soon):
1) Make Texas a leader in K-12 computer science education.
2) Increase the % of Texas community college students who take tech roles, post-college.
🙌 Update: today is my last day at @Microsoft / @github!
I'm so grateful for everything I've learned over the past year+, and for the partnerships with the @code, @openai, @github, & @githubnext teams.
Being a small part of projects like Codespaces + Copilot has been an honor. ❤️
Maybe this is naïve–but I don't think *anyone* wants to think, at the end of their career, "gee, sure helped lots of people click on ads!"; or "sure did make cameras effective at recognizing folks!"
Machine learning could be more– empowering, supporting, enabling.
It should be.
Automatic refactoring for Python code in notebooks (cleaning up & grouping code; transforming cells into testable functions)
Source code completion and docstring generation
and automatically unrolling notebook state, if you ran cells out of order. Nice!
Claude Shannon was an engineer, and took a philosophy course as an elective. It introduced him to an old system of logic showing that True&False statements could be encoded as 1s & 0s + solved like math problems.
This led to the development of binary code and information theory.
🐍 #Python family: if you ever want to inspect the source code for a module that you've imported, but don't want to waste time hunting around for the .py file:
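import inspect
import json  # example only; substitute any pure-Python module you've imported

# getsource() returns one string; print() shows it with its real line breaks
print(inspect.getsource(json))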
really bummed that everyone seems to be working on AI-as-task-completer before AI-as-educator-&-explainer or AI-as-thought-partner
we should be using these tools to help folks learn to think empirically and to ask better questions, not to just outsource thinking to a new entity
<-- will die on a hill defending that:
—Docs are required for shipping a product.
—APIs must be designed for developers.
—Naming conventions should be consistent with your community.
—Know your community's tools, and write docs and interfaces focused on those integration points.
👩💻 @Google's internal software systems have been capturing telemetry on literally *everything our engineers do*, for the last 25+ years...
...Which, stitched together, makes an awesome training dataset for a software engineering assistant. 😁
"DIDACT turns Google's software…
warning: very biased opinion
but, for developer tools: you don't need a "PM"
you need "an empathetic engineer, who is so frustrated with the experience of doing their work that they're willing to become a PM to change the state of the world for themselves and their colleagues"
✨👩💻📰 Someone give #PapersWithCode every single prize.
Displays every academic paper with code that has been open-sourced, ordered by @Github stars accumulated in the last three days.
⭐️ There are even auto-populated tags for conferences and data sets!
I've turned into someone who prefers strongly-typed programming languages and am not sure if Google did this to me or if I've always secretly been this way
✨💖 Ecstatic to *finally* be able to talk about Copilot: a collaboration between @Github, @OpenAI, and @Microsoft's Developer Tools division.
I've been using it for everything from FORTRAN to Markdown to Python for the last few months, and it has made me *so much more* productive.
🤯 My new favorite hobby is opening a random repo from @PapersWithCode (for example, ); using @code's Python extension to extract classes into methods...
...and then asking @GitHub Copilot to (1) explain that new method, and (2) recommend a name for it. 🤯
"Data is immutable. Notebooks are for exploration and communication. Analysis is a DAG. Build from the environment up. Keep secrets and configuration out of version control. Be conservative in changing the default folder structure."
More of this, please:
Not looking for responses, just need to write and to be vulnerable for a tiny second: having a parent die, especially so quickly, is such a surreal feeling.
I keep trying to postpone thinking about it, just trying to accumulate a backlog of tasks (for work, for hobbies, for…