Tackles are flawed statistics. Thanks to Quang Nguyen (
@qntkhvn
), Larry Jiang, and Meg Ellingwood in this year's
#BigDataBowl
- we have created a new defensive metric based on halting the ball-carrier's forward progress: Momentum-Based Fractional Tackles
I find it hard to believe that 10 years ago I graduated high school to begin
@CarnegieMellon
- now I'm incredibly excited to share I will be an Assistant Teaching Professor
@CMU_Stats
! I owe everything to my amazing advisors, mentors, friends, & family for all of their support!
@nnstats
It's because the majority of individuals saying you should work unpaid are white men that only hire other white men from their alma mater with rich parents that paid for their entire college education and are debt-free. - my TED talk.
@ProFootballTalk
The framing of "analytics" in this manner leads me to believe that both you and the coach you spoke to have no idea what that word means
Students in my
#sportsanalyics
class are going to complete critiques of analysis they read online throughout the semester, initial list of potential sites so far (I won't have a comprehensive list but I'd greatly appreciate any recommendations)
Does anyone have a list of
#sportsanalytics
websites by sport? I don't mean in terms of data, but rather articles (e.g., Fangraphs and Baseball Prospectus for baseball, and so on)
Here it is,
#nflscrapR
research by
@stat_sam
@bklynmaks
and myself has led to this -
#nflWAR
: A Reproducible Method for Offensive Player Evaluation in Football - explaining our full approach to estimating WAR in football and how to extend it with more data
Congratulations to Dr. Ron Yurko
@Stat_Ron
on his
@CMU_Stats
PhD defense on “Selective Inference Approaches for Augmenting Genetic Association Studies with Multi-Omics Metadata”! Co-advised by Kathryn Roeder
@RoederKat
and Max G’Sell.
#cuethestarwarsgifs
And it's in -
@kpelechrinis
and I have officially submitted our entry for the
#BigDataBowl
: A ghosting framework for evaluating defender ability to limit YAC
Here's how you win in the
#NFL
- you throw the ball better than the other team. Run the ball better than the other team? Eh who cares... you don't outscore your opponents if you can't throw - all I'm using is the difference between a team's EPA/Att for OFF and DEF from
#nflscrapR
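A minimal sketch of the quantity described above, a team's passing EPA per attempt on offense minus the passing EPA per attempt it allows on defense. The row layout, team labels, and EPA values are made up for illustration and are not nflscrapR output:

```python
from collections import defaultdict

# Hypothetical play-by-play rows: (offense, defense, play_type, epa).
# These values are invented for the example, not real data.
plays = [
    ("PIT", "BAL", "pass", 0.50),
    ("PIT", "BAL", "pass", -0.25),
    ("BAL", "PIT", "pass", 0.25),
    ("BAL", "PIT", "pass", -0.75),
    ("PIT", "BAL", "run", 0.10),
]


def epa_per_attempt_diff(plays, play_type="pass"):
    """Return {team: offensive EPA/Att minus EPA/Att allowed on defense}."""
    off = defaultdict(list)  # EPA on plays the team ran
    dfn = defaultdict(list)  # EPA on plays the team defended
    for offense, defense, ptype, epa in plays:
        if ptype != play_type:
            continue
        off[offense].append(epa)
        dfn[defense].append(epa)

    diff = {}
    # Only teams seen on both sides of the ball get a differential.
    for team in set(off) & set(dfn):
        off_avg = sum(off[team]) / len(off[team])
        def_avg = sum(dfn[team]) / len(dfn[team])
        diff[team] = off_avg - def_avg
    return diff


pass_diff = epa_per_attempt_diff(plays, "pass")
rush_diff = epa_per_attempt_diff(plays, "run")
```

A positive value means the team gains more per attempt than it gives up. Comparing `pass_diff` with `rush_diff` across a full season is one way to set up the passing-versus-rushing comparison.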
Major personal news: after much consideration I’ll be officially following
@AB84
from Pittsburgh to Oakland to join
#Raidersnation
as a Data Science and AI specialist for Coach Gruden!
@nflscrapR
will be removed from GitHub today including all of the data
#endofanera
A coach told me earlier today that analytics was just a way for people who never would have gotten jobs in football to get jobs in football. To understand that is to understand why they get so defensive whenever anyone questions analytics.
Congratulations to
@CMU_Stats
Ron Yurko
@Stat_Ron
on his successful PhD proposal on “Selective Inference Approaches for Augmenting Genetic Association Studies with Multi-omics Metadata”! Co-advised by Kathryn Roeder
@RoederKat
and Max G’Sell
#cuestarwarsgif
Major
#nflscrapR
update is live! New functions for gathering games and scraping ALL of the NFL API’s play-by-play data, meaning player IDs for everyone involved in tackles, pass defense, laterals, etc.
#nflscrapR
#rstats
#moredatamorefun
For anyone that wants an easy intro to making R packages (which you basically should do for any big project in R), check out this simple step-by-step guide by
@leerichardson09
- I have it open every time I'm making a new one
#rstats
#CUDAS
Tweets like this are a reason why I dislike a lot of football analytics twitter and am discouraged from participating in it. This isn’t constructive (and I’m still not even convinced the analysis about Rodgers is appropriate) - twitter “dunking” is a waste of time, I’m done with this
The funny thing about the 49ers winning by rushing for ~300 yards in this game is that people can't even use it to try to dunk on analytics twitter because 2 of the beliefs of analytics twitter are that this year's Packers were frauds and that Rodgers isn't good anymore, and, well
Okay -
@quarto_pub
is amazing and SOOOO much easier to use for making an academic website, this was super easy to put together and more importantly will be simple to update:
#quarto
#rstats
Long overdue refresh of the NFL analytics staffer list after an offseason with plenty of movement.
As always, this is to the best of my understanding based on both what teams list and conversations with analytics folks around the league.
We need a
#sportsanalytics
retraction watch, because this is a great retraction candidate - an example of data analysis without any notion of statistical thinking that will mislead the broader public into drawing conspiratorial conclusions (this gif applies to the whole thread)
The NFL pushed the panic button 🚨
They have a ref in their rotation who is a MASSIVE edge to road teams
Road teams win at the
#1
highest rate with him...
He penalizes home teams in ways no other ref does... and...
He's calling the Chiefs road game
Anyone in
#sportsanalytics
community put together a list / repository of various publicly available data sources across different sports? Like I'm thinking different
#rstats
and python packages, websites to export CSVs, etc - any help appreciated!
BREAKING NEWS:
Applications are now open for
@CMU_Stats
Summer Undergrad Research Experience!
Program theme
#DataScience
in
#SportsAnalytics
aka
#CMSACamp
!
Set for *in-person* Jun 6 - Jul 29, w/ provided housing, plus a $4000 stipend!
Deadline: Mar 23rd!
Top 25 players in 2017 NFL season based on
#nflWAR
methodology, of course Russ
#1
, only non-QB is Antonio Brown but is ranked 25th... Jameis Winston's air WAR > total WAR because of his negative rush WAR, full paper details here
#nflscrapR
#rstats
#dataviz
A couple years ago I would spend time just churning out several plots using
@nflscrapR
EPA/WPA - try to tweet it at popular folks like
@minakimes
@billbarnwell
hope for some retweets etc to promote
@nflscrapR
and to be honest it felt like I was just yelling in an empty room
BIG NEWS:
@CMU_Stats
@CMUAnalytics
is hosting the first ever
#CMSAC
Reproducible Research Competition! Requiring the use of publicly available data and access to all your code/analysis - we're hoping to make
#sportsanalytics
research reproducible!
#CMSAC18
@ScoutWithBryan
@nnstats
Thank you for letting us know how privileged you were in life to be able to take an unpaid internship with a NBA team in college
#Packers
QB Aaron Rodgers received homeopathic treatment from his personal doctor to raise his antibody levels and asked the NFL to review his status. The NFL, NFLPA and joint docs ruled him as unvaccinated. Now, he has COVID-19.
More here:
Great time today w/ Simon Fraser
#sportsanalytics
group discussing '
#NFL
Ghosts: A framework for evaluating defender positioning using high-dimensional conditional density estimates' - fantastic virtual speaker series w/ all talks available online:
Second
#blogdown
post in my
#BayesianBabySteps
series - intro to using Laplace approximation to model
#NFL
score differentials, determining the value of passing efficiency relative to rushing with EPA from
#nflscrapR
and the effect of strong priors
#rstats
Problem prevalent in
#sportsanalytics
discourse: overconfidence and misunderstanding of expectation, coupled with lack of respect and understanding of variance
Great read from Kevin - for all the discussion about Shanahan's decision making, Mahomes being inevitable, etc., the turning point of the game (objectively and subjectively) was a punt that hit Darrell Luter Jr's foot followed by Ray-Ray McCloud III failing to pick it up
This is the part I can't stop thinking about - like if you seriously continue to think the US healthcare system doesn't need a radical overhaul while living through all of this then either you're rich or just so far gone in terms of political vitriol
.
@edyong209
: “A country that, 7 months into a pandemic, still cannot ensure that its healthcare workers have enough gowns and gloves and protective equipment is not going to be able to distribute a vaccine in an efficient way. It simply isn’t.”
And now
@nflscrapR
has been used in a
@WSJ
article - we (
@bklynmaks
@stat_sam
and me) are pretty proud of what we’ve accomplished with this
#rstats
package, it’s a great lesson on sharing your work, code, AND data so ANYONE can use it (students, journalists, fans, hobbyists etc)
Looks like you can now access online the official publication of the culmination of
@nflscrapR
research
#nflWAR
by
@stat_sam
@bklynmaks
and myself, as it officially heads to print - crazy to think how long ago it was now as I start PhD year 3
@CMU_Stats
BREAKING NEWS:
#CMSACamp
is back! Applications are now open for
@CMU_Stats
Summer Undergrad Research Experience!
Program theme is once again
#DataScience
in
#SportsAnalytics
!
Set for Jun 5 - Jul 28 w/ provided housing + $4000 stipend!
Deadline: Feb 26th!
Initial advice I've recently been giving people whenever they ask me about entering
#sportsanalytics
:
1) Make a twitter account and follow
@StatsbyLopez
@nnstats
2) Look here
3) Find and work on a project that interests you, then promote it like crazy
Never received my PA mail-in ballot, went in person, was mocked by people for requesting a mail-in, and waited until they allowed me to do a provisional - I voted for people I care about since as a white male US citizen I'm privileged to not be affected by the outcome
#GoVotePA
A source gave us unprecedented access to 73,000 MLB scouting reports.
@BenLindbergh
will break down his findings in a three-part series this week. First up, how good are scouts at projecting success?
This is what a
#sportsanalytics
class looks like
@CMU_Stats
w/ Bayesian state-space modeling in Stan following the classic Glickman & Stern model... a fun, simple example for
#NFL
team ratings in the Patrick Mahomes era (2018-present)
#dataviz
Going to cover the classic Glickman
& Stern (1998) model in
#sportsanalytics
class next week with Stan demo, naturally
@StatsbyLopez
& co. already did this (in BUGS)
Just found out we won the student poster prize at
#NESSIS
! All credit goes to Riccardo ( was busy in London w/ ML fairness) and Natalia! Such a fun project about the wild world of Ultra Trails, check out Riccardo's package here
Excited to share my first
@PNASNews
publication with my advisors Max G'Sell, Kathryn Roeder, and Bernie Devlin: 'A selective inference approach for false discovery rate control using multiomics covariates yields insights into disease risk'
This is essentially the main reason why I find events, such as the Sloan conference (yes I’m more than willing to call it out), to be problematic - they’re creating such a high barrier to entry propagating an obvious problem in
#sportsanalytics
, it needs to be inclusive
#dobetter
I also think it’s important to note that they’re not arguing that analytics aren’t important—just that their growth will crowd some folks out, because of issues like pipelines. Which are the same issues that crowd out coaches with the *wrong* positional resume, for example.
Holy crap,
@nflfastR
makes it easy to calculate quantities that you want. Thank you,
@benbbaldwin
,
@Stat_Ron
and everyone else that has made this possible. Download the csv files here:
This has definitely been the most frustrating part of seeing media reaction to the forecasts - "the worst-case 95th percentile projection is nowhere near reality so these models are awful!"
There's a difference between what the models said and "what the models said" as interpreted by the media, which often emphasized worst-case scenarios rather than the broader range of possibilities they articulated, some of which were conditional on there being no distancing, etc.
Presenting Markov Model of Football by
@keithgoldner
of
@numberFire
in class, so I reproduced his paper's original absorbing Markov chain with 09-17 play-by-play data with
#nflscrapR
- code to construct the transition matrix and basic calculations here
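For anyone unfamiliar with absorbing Markov chains, here is a toy sketch of the "basic calculations" involved: the fundamental matrix N = (I - Q)^-1, the absorption probabilities B = NR, and the expected number of plays before a drive ends. The two field-position states and every transition probability below are invented for illustration. This is not the paper's state space and not the code linked above.

```python
from fractions import Fraction as F


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination (a is n x n, b is n x m)."""
    n = len(a)
    # Augment a with b so each row operation applies to both sides.
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Swap a row with a non-zero pivot into position.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot equals 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical states: two transient field zones, two absorbing drive outcomes.
transient = ["own_half", "opp_half"]
absorbing = ["touchdown", "no_score"]

# Q: transient -> transient probabilities (made-up values).
Q = [[F(0), F(1, 2)],
     [F(1, 4), F(0)]]
# R: transient -> absorbing probabilities (made-up values).
R = [[F(0), F(1, 2)],
     [F(1, 2), F(1, 4)]]

n = len(transient)
identity = [[F(int(i == j)) for j in range(n)] for i in range(n)]
i_minus_q = [[identity[i][j] - Q[i][j] for j in range(n)] for i in range(n)]

N = solve(i_minus_q, identity)  # fundamental matrix (I - Q)^-1
B = solve(i_minus_q, R)         # absorption probabilities N @ R
expected_plays = [sum(row) for row in N]  # expected plays until absorption

for state, probs, plays in zip(transient, B, expected_plays):
    print(state, dict(zip(absorbing, probs)), "expected plays:", plays)
```

Exact fractions keep the toy numbers easy to check by hand. With real play-by-play data, Q and R would be estimated from observed transition counts and a numerical linear algebra routine would replace the hand-rolled solver.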
I made another football data package... world football that is - introducing
#fcscrapR
which allows you to easily grab, for any match on ESPN with commentary data, shot attempts, substitutions, cards, fouls! Including
#WorldCup
matches like
#SRB
#CRC
While this is obviously a limited view of QB performance, old man Russell Wilson is a clear upgrade over Kenny Pickett (and probably a better passer than Fields without the need of trading picks) - I'm intrigued by Russ in a heavy play-action Arthur Smith
#Steelers
offense
BREAKING: The FTC just banned non-compete agreements.
The Federal Trade Commission has issued a final rule making it illegal for bosses to make workers sign noncompetes in any scenario, and voiding nearly all existing noncompetes.
This is a game changer for American workers.
Fun first full day with
@WMoneyball
students teaching
#rstats
, seeing how passionate the students are, along with our
@CMU_Stats
summer program students visiting Dan Fox and
#Pirates
analytics tomorrow, reminded me of the most ridiculous email I’ve ever sent... but it worked
Thank you to
@asmae_toumi
for being vocal about this - the leaders of
@SloanSportsConf
need to be held accountable and actually address this prevalent problem to fundamentally change its culture - you are limiting accessibility to
#sportsanalytics
Open Source Sports is officially alive on Substack:
This will be a free newsletter where I highlight and translate the latest
#sportsanalytics
research in academic journals that the broader public may not be aware of:
See you there!
@PFF_Eric
What truly blows my mind about this, is the context with professional athletes regarding all the other supplements, crap, etc. they take with well known side-effects that can be pretty awful - and yet they won't get vaccinated???
Congrats to Rishav Dutta
@rishavd64
! His work on using Gaussian mixture models for
#NFL
pass coverage with
#BigDataBowl
tracking data was accepted for publication in JQAS! He worked incredibly hard, transforming this paper from its first version:
Competing in
#NFL
#BigDataBowl
? Check out our updates to
#GoingDeep
available on
@arxiv
here: Improved model performance with better covariates and demonstrated the use of RFCDE for continuous-time
@nflscrapR
EP and WP values
Spent some time updating my CV and website, and finally decided to start blogging to share more info about my projects and what I’m learning
#mistakeswillbemade
#blogdown
If you're new to tracking data and looking for some example
#BigDataBowl
code -
@kpelechrinis
and I posted the code for our submission last year here: which also includes 'Going Deep' scripts
#rstats
Today we "officially" wrapped up the 2 month
@CMU_Stats
#CMSACamp
#sportsanalytics
themed undergrad research experience - I was fortunate enough to be the instructor of an incredible group of students who you should pay attention to as they grow in their careers & research
1/n