I read Information Operations Recognition: From Nonlinear Analysis to Decision-Making about three years ago. I found the OSINT of Chapter 2 and the network analysis in Chapter 4 immediately familiar, but I ran aground hard in Chapter 3, Elements of non-linear dynamics for information operations recognition. Way back when, my sophomore year in college, I shared an apartment with three other guys, one chemical engineer, one electrical engineer, and one nuclear engineer. I recognize the math in that chapter as second year calculus for engineers, but being in computer science I had already made my escape from such things.
I've needed a Python oriented replacement for JGAAP for years. I've spent a lot of time with Open Semantic Search, which implicitly meant spending some time with spaCy, and because I was in a mode of learning rather than full throttle pipeline tuning, I've also had some Natural Language Toolkit adventures. But the only graphical Python thing that makes sense is Orange Data Mining, which I first mentioned in Attribution Using Stylometry. This is something I played with years ago, but it didn't stick for me.
A couple days ago I had waded through about a dozen of their training videos on statistical matters (the other big gap in my skills) ... and then I noticed the four most recent are right in line with what I was needing to do with comparing writing samples.
Longtime readers will recall that I previously wrestled with "brain fog" as a result of Lyme disease and its treatment circa 2007 - 2009. New things that should have taken me a month to reach proficiency would require a year ... or two ... or maybe they'd just remain out of reach. Switching from Perl to Python was an agonizing drag from mid-2011 to mid-2013. I have started Matt Jackson's Social and Economic Networks: Models and Analysis at least FIVE TIMES, and every single time, four to six weeks in, I would have a health downturn and stop attending.
But I got my brain back in early June and this change seems to have stuck. There are no words in English to describe the gratitude I feel for this. I can study something for a bit and it starts working for me. WOW.
Visual Programming:
Orange has a visual programming style. Each of the nodes on this graph is a "widget" - loading a file, turning it into a corpus (bag of words for NLP), looking at it, and then this embedding thing, which is a machine learning function. The distances are a least squares stats thing, the clustering bunches words up by their meaning, and t-SNE is some sort of visualization thing I just found. This feels a lot like the Unix command line tool chain environment - simple things you can stick together in order to produce solutions to complex problems.
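To make that chain of small steps concrete, here is a rough sketch of the corpus, bag of words, and distances stages in plain stdlib Python. It is not Orange's API: the function names and sample texts are invented for illustration, plain word counts stand in for the embedding widget, and cosine distance is my assumption rather than whatever the Distances widget actually defaults to.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count word tokens (a crude 'corpus' step)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty sample as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def distance_matrix(samples):
    """Pairwise distances between every pair of named writing samples."""
    bags = {name: bag_of_words(text) for name, text in samples.items()}
    names = sorted(bags)
    return {(x, y): cosine_distance(bags[x], bags[y])
            for x in names for y in names}


# Hypothetical writing samples, purely for illustration.
samples = {
    "a1": "the cat sat on the mat and the dog sat on the rug",
    "a2": "the dog sat on the mat and the cat sat on the rug",
    "b1": "quarterly revenue growth exceeded analyst expectations again",
}
dist = distance_matrix(samples)

for (x, y), d in sorted(dist.items()):
    if x < y:
        print(f"{x} vs {y}: {d:.3f}")
```

Samples that reuse the same vocabulary land close to zero and samples with no words in common land at one. A clustering or t-SNE step would then work from a matrix like this, which is exactly the part where I need the stats background to know what the picture is really telling me.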
And what excited me so much is this - the Spectroscopy section. A portion of that math I only vaguely recognize is available as widgets and there's a little video training section on using it. Game on!
Or maybe ... not so fast!
Danger Zone:
I don't recall where I first saw Drew Conway's piece The Data Science Venn Diagram, but it immediately stuck with me.
The Danger Zone! haunts me. There were three and four and five credit hour statistics classes for almost every discipline when I was in school. But computer science and industrial engineering had this diminutive practical stats class that was just two credit hours. I don't recall precisely why I chose that class, perhaps weariness with battling calculus, but that's what I did. They say we use 10% to 15% of what we study in college and I would put that little stats class at the very top of that heap.
Conclusion:
Maybe, now that I've got my brain out of hock, 2024 is a time to shore up my less than stellar stats background. The point is this: having those spectroscopy tools in an easy canned format is great ... but not if that leads me to a place where I think I know what's going on, but I've committed some sort of grim unwarranted inference.
I make fun of the folks who goof on attribution using the community edition of RiskIQ, see Sovereign Challenge Theropod Stampede for the particulars. I don't want to be THAT GUY, wading into some IO attribution problem, and promptly shooting myself in both feet.
So it's a problem ... but one that will no longer be torment for me to solve. I just need to spend a little time filling in some gaps.