Here is one of the best books to improve your data skills.
But don't stop there!
Combine your feature engineering skills with automatic experiment tracking, and you'll 10x your ability to build machine learning models that work.
Start for free today:
Distance metrics are critical for machine learning.
Many algorithms use distance metrics to calculate the similarity between two data points.
A few examples:
1. Euclidean distance
2. Manhattan distance
3. Minkowski distance
4. Hamming distance
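As a rough illustration of the four metrics listed above, here is a minimal pure-Python sketch. The function names are our own choices for this example, not taken from any particular library, and the inputs are assumed to be equal-length sequences.

```python
import math


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length numeric sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def manhattan(a, b):
    """Manhattan distance: the Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Euclidean distance: the Minkowski distance with p = 2."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming(a, b):
    """Hamming distance: the number of positions where the two sequences differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(euclidean((0, 0), (3, 4)))
print(manhattan((0, 0), (3, 4)))
print(hamming("karolin", "kathrin"))
```

Manhattan and Euclidean are both special cases of Minkowski, which is why one helper covers both; Hamming is the odd one out because it counts mismatches instead of measuring magnitudes.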
A critical skill for every Python developer: Learn how to deal with data.
Here is a step-by-step example for you:
1. Grab data from multiple locations
2. Automatically load it into a spreadsheet
3. And do it all using Python
Here is the link:
Building Machine Learning Systems is a team effort.
If you want to build a team, here are the roles you need:
• AI Architect
• Data Scientist
• Data Engineer
• Machine Learning Engineer
Here is the responsibility of each of these roles:
Experiment Tracking is one of the simplest ways to improve your machine learning setup:
• Version your datasets
• Debug and reproduce your models
• Visualize performance across runs
• Collaborate with teammates
And here is the best part:
1 of 3
We talked to dozens of Computer Vision practitioners, and their feedback was consistent:
Visualizing data is painful during exploratory data analysis.
To solve this problem, we built and open-sourced Kangas.
And the community loves it!
How do you improve your
#DeepLearning
setup in two minutes?
Integrate your code with "Automatic Logging." A few lines of code, and you'll never go back.
Here is how it works:
1 of 5
It’s here! Our first open-source project is now live! Introducing Kangas, a smart data exploration and analysis tool for machine learning.
Dive into your data and analyze it like never before.
If you're looking for a summary of mathematics for
#MachineLearning
, this paper is a goldmine!
It's 100% free, courtesy of UC Berkeley, and covers the following topics:
• Linear Algebra
• Calculus and Optimization
• Probabilities
There are 3 fundamental roles in a team dedicated to building a Machine Learning system.
But we usually only talk about Machine Learning Engineers.
Here are the roles you are missing and their responsibility:
1/4
💥 BIG ANNOUNCEMENT 💥
Today, we’re releasing our LLMOps course, taught by the fantastic
@omarsar0
+ sponsored by
@OpenAI
! We’ve worked with dozens of teams using LLMs + distilled their learnings into this course. Enroll for free now 👇
Sebastian Ruder (
@seb_ruder
), a research scientist
@_aylien
, reviews the history of NLP — from sequence-to-sequence learning to adversarial learning
See the full history here:
From this week's Comet Newsletter: Our friends over at
@huggingface
just released a new, completely free video course on transformers and other key
#NLP
topics.
Most data scientists focus on training models.
That's great, but it's only 5% of the work! Building end-to-end systems is hard.
Here are 30 requirements for an MLOps environment.
Link:
Multimodal datasets are becoming more common in deep learning.
Inspired by a seminar at LMU Munich, a team of students and researchers set out to collate existing knowledge and burgeoning approaches. Read more:
Comet +
@Gradio
= the power to run and test your models directly within your
#ML
experimentation pipeline. Create powerful model-based GUIs/demos, and include them in your Comet dashboard.
🛠️ Colab Notebook:
📙 Learn more:
Recommender systems are among the most impactful ML models in production. They influence our everyday decisions, from what news to read to what shoes to buy. Unfortunately, evaluating the performance of these models isn't easy. This is where RecList comes in...
1/4
Data scientists struggle to select the best Machine Learning tool for their project.
With so many options, deciding becomes overwhelming, and the wrong choice can kill your project.
Here are 9 of the most popular Machine Learning tools in the industry right now:
Introducing a brand new suite of Prompt Engineering tools from Comet! With advanced capabilities like Prompt Playground, Prompt History, Prompt Usage Tracker & seamless integration with OpenAI & LangChain, streamlining LLM workflows has never been this easy.
What Is the Machine Learning Lifecycle?
There’s no one formula for developing models, but most projects follow standard steps.
Here are the top 10 tools to optimize the ML lifecycle.
With the new
#StyleGAN3
released, we thought it'd be helpful to have a way to visualize training run outputs—images & videos, but also metrics, hyperparameters, etc.
@DN06_90
& Michael Cullan put this
#Colab
notebook together that does just that
👉
RSVP for our meetup tomorrow with
@GuggerSylvain
, who will cover an infinitely customizable training loop using
@fastdotai
— food, beverages, and networking w/ data scientists included!
Co-hosted with NYC
@PyTorch
🤓
If you haven't had the chance to start building
#NLP
Models, we suggest getting started using this awesome Sentiment Analysis tutorial. You'll learn how to create BERT models using the
@huggingface
package.
This
#BlackHistoryMonth
we'll be highlighting some great Black ML & AI contributors. 👇
Deb Raji is a
@Mozilla
Fellow working on AI accountability & model auditing. Raji has worked on many crucial ML papers like Model Cards.
@rajiinio
Read her work:
We're big fans of
@huggingface
.🤗 Check out this blog on how to auto-log model metrics and parameters to Comet from the Hugging Face transformers library.
#AI
#ML
The recording of
@SanhEstPasMoi
's talk on productionizing NLP models is up!
Check out the awesome work that the
@huggingface
team is doing with
#NLP
and
#NLG
models within conversational AI 🗣
Watch the recording here:
With transfer learning, should you use pre-trained models for feature extraction or fine-tuning?
@bhutanisanyam1
reviews a great paper from
@seb_ruder
and
@mattthemathman
et al. to compare how these approaches perform on NLP tasks
See the full overview –
The 4 stages of a machine learning project lifecycle:
1. Project scoping
2. Data definition and preparation
3. Model training and error analysis
4. Deployment, monitoring, and maintenance
We built Comet to streamline the process. Get started today:
Comet is Growing!
Welcome aboard Comet's new Data Science Advocate/Evangelist Ayodele
@DataSciBae
! We're excited to have you here creating helpful content for our users.
Are you open to roles in
#MLOps
& helping Data Scientists track their experiments?
Calling all Machine Learning enthusiasts! Mark your calendars for June 22nd, 8 AM PT for a free webinar on LLMOps + Model CI/CD with Comet.
Register today for free and unlock the next level in your ML journey:
#webinar
#LLMOps
#PromptEngineering
🧵 (1/8) ICYMI: Our public
#AIArtwork
gallery, powered by
@kvfrans
's
#CLIPDraw
,
@Gradio
, and Comet is live and ready for submissions!
👇TL;DR
Submit your prompt:
View the project:
Full blog post:
🚨 New Comet integration: You can now create a
@Gradio
demo & include it in your Comet dashboard with just 1 extra line of code, so your team can better understand & explore your
#ML
models.
🛠️ Colab Notebook:
📄 Learn more:
Thrilled to announce a new partnership with the team over at
@gitlab
!
Combining GitLab's powerful CI/CD pipeline tools with Comet's experiment management & visualization capabilities,
#ML
teams can much more effectively streamline their ML workflows.
1/6
ArXiv Sanity provides a much more manageable interface for the latest ML papers, with thumbnails and abstracts for preview PLUS recommendations for similar papers! ✍️
Huge kudos to
@karpathy
for creating this awesome tool!
A new slide deck from
@joelgrus
(yes, the "I don't like notebooks" guy of JupyterCon fame) that depicts his long journey to data science + what he still sees as lacking within the discipline today — from proper unit testing to irreproducible workflows.
🛠️ Prompt Engineering is the most cost-effective strategy to leverage LLMs for your applications. But current prompt engineering workflows are incredibly tedious and cumbersome.
🔥 CometLLM is a new tool that documents all your experimentation with LLMs.
🧵👇
LangChain 🤝
@Cometml
Excited to announce an integration with CometML: they recently added a prompt playground, prompt history (can visualize chains), and prompt usage tracker
All needed to help take your LangChain apps from prototype to production
Every bank deals with the same problem:
Credit card fraud.
Here is a tutorial showing you how to use autoencoders to detect fraud.
(Includes the full source code.)
To deal with the subjective rules of grammar, Google uses machine translation models to suggest corrections.
These models work by treating text with incorrect grammar as the “source” language and correct grammar as the “target.”
Read on here —
This project from Victor Sanh (
@SanhEstPasMoi
) and the
@huggingface
team blew us away 🔥
The Hierarchical Multi-Task Learning (HMTL) model can learn embeddings that can be shared between different semantic tasks
See the demo here —
Kangas is still in its early days, but super excited to see this growth. We've gotten code contributions and stars from MLEs at companies such as Dell EMC, IBM, Oracle, LEGO, Wix, Google, Autodesk, Docusign, and more!
Try it out:
Today in Heartbeat: With
@LangChainAI
's LCEL, users can adopt a declarative approach to chain composition, facilitating operations like streaming, batch processing, and asynchronous tasks.
Learn more in this guide from
@DataScienceHarp
:
Today in Heartbeat: Omale Happiness uses Python to develop a supervised learning text classification model that can predict the sentiment category of a sentence, and builds an interactive web application that displays the result to the users.
Today in Heartbeat: With the rise of
#LLMs
like
#ChatGPT
and
#Bard
, a new, lucrative field has emerged.
#PromptEngineering
is the art of guiding language models with clear, detailed, and well-structured prompts.
@BasakBuluz
explains more here:
🚨 Upcoming MeetUp!
We're restarting our industry Q&A series next Monday, July 26th, at 11am ET.
We’ll be chatting about collaborative
#ML
tools with
@jakubjurovych
of
@deepnotehq
&
@abidlabs
of
@Gradio
.
👉 Register for free, & bring your questions!
Truly amazing work by Jonathan Barron from
@GoogleAI
Check out his presentation on an adaptive loss function that automatically adapts the robustness of the loss during model training
Very cool app (and end-to-end machine learning pipeline) idea for automatically tagging GitHub issues!
Hooks together the GitHub API, BigQuery, Flask + more and tested with the
@kubeflow
repo 🏷
See the full rundown from
@HamelHusain
and team here:
Comet ML partners with
@Uber
AI on Ludwig, allowing users to track Ludwig-based experiments live as they are training. Ludwig is a no-code deep learning toolbox developed by Uber. Read the announcement and get started with Comet and Ludwig today:
Object detection is the most popular application of computer vision.
But how can you systematically find the best object detection model for a particular task?
Check the thread for the full-code tutorial using TorchVision.
Love these
@feedly
NLP virtual breakfasts!
The latest edition with
@SanhEstPasMoi
of
@huggingface
on Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks is excellent 🚀
Proud to be part of
#CodeCarbon
, an amazing initiative with
@Mila_Quebec
,
@BCG
, and
@haverfordedu
, that will help organizations measure the environmental impact of
#AI
.
Learn more about this incredible work:
Can AutoML match hand-crafted machine learning models?
Adrian Rosebrock (
@PyImageSearch
) puts Auto-Keras to the test in this great classification tutorial
See the full post here —
Congratulations to
@FrancescoLocat8
and team for their Best Paper Award at ICML 2019
Check out the blog post explaining their paper "Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations"
In Prompt Engineering for Vision Models, taught by
@anmorgan2414
@JacquesVerre
and
@KaiserFrose
of
@Cometml
, you’ll learn how to prompt and fine-tune vision models for personalized image generation, image editing, object detection and segmentation. The prompts you'll use for
🔥In a continued effort to close the gap between Data and Experiment Management, we are very excited to announce a new partnership with
#Snowflake
! ❄️
Comet now offers support for tracking and versioning SQL queries, as well as logging sample DataFrames!
Want to see a neat trick on how to log an interactive plot in Comet?
Let's say you have a Plotly figure called py_fig
Use the following command in your code!
experiment.log_html(py_fig.to_html())
Then go to the HTML section to see your interactive plot 💪
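Here is a minimal sketch of that trick wrapped in a helper. The helper name `log_interactive_plot` and the two stand-in classes are hypothetical and exist only so the snippet runs without Plotly installed or a Comet API key; the only real calls assumed are the figure's `to_html()` and the experiment's `log_html()` from the command above.

```python
def log_interactive_plot(experiment, fig):
    """Render a figure to HTML and log it to the experiment's HTML section."""
    html = fig.to_html()
    experiment.log_html(html)
    return html


# Hypothetical stand-ins so the sketch runs offline. In real use, `fig` would be
# a Plotly figure and `experiment` a Comet experiment object.
class FakeFigure:
    def to_html(self):
        return "<div>interactive plot</div>"


class FakeExperiment:
    def __init__(self):
        self.logged = []

    def log_html(self, html):
        self.logged.append(html)


experiment = FakeExperiment()
log_interactive_plot(experiment, FakeFigure())
print(len(experiment.logged))
```

In a real project you would drop the stand-ins and pass your own py_fig and experiment straight into the helper.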
RSVP for our next Meetup to hear speakers from
@XaxisTweets
,
@huggingface
, and
@runwayml
discuss productionizing machine learning models!
See the full details for March 7th here:
Unlock the true potential of your ML workflows like Shopify did with Comet's Model Registry and Experiment tracking tool. Learn how Shopify served machine learning models across multiple teams with a robust solution for real-time predictions.
Welcome to Friday!
To celebrate, we like looking at machine learning memes — thanks to
@ericjang11
for curating 🤓 (and the rest of his blog is pretty spectacular)
We built a few cool integrations during 2022:
• YOLOv5
• Metaflow
• spaCy
• Catalyst
• Anomalib
• RecList
• Ray
• Pythae
• Run:ai
@Cometml
now works with some of the most popular machine learning libraries and frameworks out there!
#FoundationModels
aren’t perfect for every use case; they know a little about a lot, but many real-world
#AI
applications need to know a lot about a little
Learn how Autodistill lets you benefit from these massive models without deploying them explicitly:
Our Model Production Monitoring gives you access to 3 fundamental levers:
1. Identify if your model is struggling
2. Track drift across inputs and outputs
3. Get alerts if anything is wrong
Check this out:
Any ML engineer should be using these tools: Google Colab,
@huggingface
transformers, Comet, Scikit-Learn, and PyTorch.
See
@Comejoinfolks
use all of the above to build a text classification model.
#ControlNet
is a tool for
#StableDiffusion
’s web UI that fuses art with QR codes.
Although these codes are sometimes practically invisible to the naked eye, they are all still fully functional.
Check out the repo to start making your own:
🎉 Super excited today to welcome Harpreet Sahota (
@ArtistsOfData
) to the Comet fam as a Data Scientist on the
#Growth
team! We're so thrilled to build with you.
And be on the lookout for all kinds of awesome content, projects, and more from Harpreet and the rest of our team.
This article will show you how to:
1. Fine-tune TorchVision’s pre-trained models
2. Handle different image annotation formats
3. Calculate evaluation metrics for object detection
4. Compare different object detection models
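For a feel of step 3, here is a small sketch of intersection over union (IoU), a common building block of object detection metrics. This is our own illustration, not code from the article, and it assumes boxes are given as (x_min, y_min, x_max, y_max) tuples.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A prediction is typically counted as correct when its IoU with a ground-truth box clears a chosen threshold, which is what precision-style detection metrics are built on.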
Join us for our MEETUP with
@huggingface
on running large scale state-of-the-art NLP models with
@SanhEstPasMoi
and
@MorganFunto
! On top of building one of the best NLP libraries out there, Hugging Face also uses Comet for experimentation and optimization!
🎙️
#ConvergenceConference
speaker announcement!
@ShivikaKBisen
, Lead Data Scientist at
@PAXAFE1
, will present "Testing ML models for production" at Convergence on March 2.
Register now to catch this and all the other great talks!
💬 Unlike traditional machine learning projects,
#LargeLanguageModel
evaluation requires extensive logging of prompts, prompt templates, responses, tags, and other metadata.
🤯 This can make tracking the performance of
#LLMs
in real-time especially challenging.
🔎 Comet helps
When building with LLMs, you will spend a lot of time optimizing prompts and diagnosing LLMs.
As you put your solutions into production, you need LLMOps tools to track LLMs and analyze prompts.
Here is a demo of how this process might look (use case included):
Step 1 - The
🚩Interested in how software engineers at Google use
#Protobufs
and
#TFRecords
to optimize their
#DeepLearning
pipelines?
🚩Check out this end-to-end tutorial with a deep-dive on everything you ever wanted (and didn’t want) to learn about Protobufs!
📰 Exciting news! We've launched the Comet Newsletter, a weekly dive into
#ML
industry news, projects, resources, & more—along with some unique perspective from our team of experts!
Check out the 1st issue and subscribe—we've got some big things planned!
Just announced: New integrations with
@raydistributed
,
@kubeflow
, and Google Vertex AI strengthen Comet as the only combined experiment management and model production monitoring solution with true enterprise scalability and extensibility.
Read more here:
A huge thanks to
@nycmedialab
for hosting us at Machines + Media 2018 earlier this month!
Watch our CEO give a pitch on how Comet is the future of
#machinelearning
management!
Data Science teams are most effective when each part of the data science hierarchy pyramid is covered. So who makes up a Data Science Team and what skills do they need? Learn more from
@iremkomurcu
.