I find myself surrounded by networks of networks, and I’m not precisely sure how I arrived here or where I’m going. Which raises the question “Where to begin?”
OK, you’re right, at the beginning, chronologically speaking.
Attention Conservation Notice:
When I’m not sure what I’m doing, I sometimes sit down and wax poetic about it until things become clear. This is one of those times where there are a bunch of balls in the air. I guess I’m … trying to scope the Malign Influence Operations Safari for second quarter, and I haven’t had a lot of time to ponder it. If you’re not into navel gazing, feel free to excuse yourself …
2000s History:
I completed some advanced training right at the turn of the century, earning the Cisco Certified Network & Design Professional ratings, which is akin to a master’s degree in the telecommunications networking business. Over the years I’ve worked on TDM (time division multiplexing) phone system networks, ATM (Asynchronous Transfer Mode) cell relay networks, Ethernet/Token Ring/FDDI enterprise LANs, and point-to-point and multipoint microwave networks. I got so good at it they paid me to teach others.
Out of that experience I periodically rewrite The Shape of Cyberspace, which is a layman’s introduction to the underlying layered topology of the internet.
2010s History:
I first laid hands on Maltego in the fall of 2010. Literally the very first transform I ever ran for Twitter got me a visit from the FBI counter-terror squad. There was some absolute crackpot “Baptist minister” with a “church” in a run-down trailer in South Carolina who’d been waving guns at me over the internet. He made some veiled references to a personal visit, asked people to pray for him, and then disappeared for twenty-four hours, when previously he’d not gone more than six hours without Tweeting for months. I made a public fuss about a sudden trip, he was back at his computer and gun waving again twelve hours later … then he got a door knock.
I was immediately sold on the concept of link analysis.
Social networks are VERY different from timeslots, cells, or frames on a wire. I didn’t own these networks; I was an onlooker. Instead of frameworks like Wireshark there were graphical marketing tools, things like Maltego, and later Gephi, where acquiring data cost time and often money. What began as a few simple Perl scripts in 2010 evolved into a clustered system running ArangoDB, Elasticsearch, and RabbitMQ, with a collection of applications cobbled together with Python.
There isn’t a bookshelf for SNA like the one you see for Cisco products and networking technologies. I had a good intuitive start, I took Lada Adamic’s class on Coursera, and since 2015 I’ve periodically made attempts to complete Matt Jackson’s Social and Economic Networks. It’s a graduate level class, my statistics-fu is not strong, and life gets in the way. But I am wiser for having made the attempts.
My capstone in this area was publishing a recording of the information operation that led to the Capitol Siege, some 220 million tweets and profiles I captured from July of 2019 through a month after the attack. I know some things, but they’re hard to convey to others who don’t have training in the field, and the powers that be were even then not ready to admit we had an existential threat to our democracy.
2020s Evolution:
I still look at computer networks, but not so much at the level of what’s on the wire. The focus has been more on DNS, SSL certs, domain registrations, and the like. That’s a combination of Maltego and RiskIQ.
I still look at social networks, only now there’s less of the enormous single source volume I once had. This hand-drawn network of sixty Substacks and their connections has done more good than that enormous body of Twitter content. The second Maltego graph is a sketch of various activist groups, which I started keeping because some individuals in them just HAD to be separated in order to keep the peace. I’m not naming names, but I am laughing a bit at the folks in that graph who are also subscribers here. Ell. Oh. Ell.
And now there’s Semrush and its browser extension friend SEOQuake. And it’s got this terrible tease graph inside.
That’s a good tool for what it is. It shows the domain you’re examining, its direct connections, and THEIR direct connections. Blue is you, green marks good sites mentioning you, and red denotes things that may be problematic. I want to see this for a cluster of over a hundred domains and there’s no web interface for that.
I started messing with Python, then broke out Gephi for the first time in months. I suspect, given the cost of Semrush API access, that this is all going to get cached in an ArangoDB database. So the results are weeks away, not minutes like they’d be with purpose-built tools like Maltego and its transform service providers.
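Here’s a rough sketch of what that “pay once, cache, then graph” loop might look like. Everything in it is an assumption on my part: `fetch` is a hypothetical stand-in for whatever paid lookup ends up returning the domains that link to a given domain, a local JSON file stands in for the ArangoDB cache, and the output is a plain CSV edge list with Source and Target columns, which Gephi’s spreadsheet importer can read. It walks two hops out from the seed domains, the same depth as the tease graph.

```python
import csv
import json
from pathlib import Path


def cached_lookup(domain, fetch, cache_path):
    """Return the referring domains for `domain`, calling `fetch` only on a cache miss."""
    cache_path = Path(cache_path)
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    if domain not in cache:
        # `fetch` is a hypothetical callable wrapping the paid API.
        cache[domain] = sorted(fetch(domain))
        cache_path.write_text(json.dumps(cache, indent=2))
    return cache[domain]


def two_hop_edges(seeds, fetch, cache_path):
    """Collect (referrer, target) edges for the seeds and for the seeds' referrers."""
    edges = set()
    seen = set()
    frontier = set(seeds)
    for _ in range(2):  # hop 1: seeds, hop 2: their direct connections
        next_frontier = set()
        for domain in sorted(frontier - seen):
            seen.add(domain)
            for referrer in cached_lookup(domain, fetch, cache_path):
                edges.add((referrer, domain))
                next_frontier.add(referrer)
        frontier = next_frontier
    return sorted(edges)


def write_gephi_edges(edges, out_path):
    """Write an edge list CSV with the Source/Target header Gephi expects."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Source", "Target"])
        writer.writerows(edges)
```

Rereading the JSON file on every lookup is wasteful, but it keeps the sketch short and means an interrupted run loses nothing that was already paid for. A real database would take the place of that file.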
Conclusion:
Akin to The Shape of Cyberspace, the SEO space is a multilayered network with multiple types of entities. What I see thus far are:
Domains which have a weight in the form of an Authority Score.
Links (outbound) and backlinks (inbound).
Many sorts of marketing tech identifiers for sites, with Google Analytics being the most prevalent among the thirty or so I’ve encountered.
There are ad exchange networks, which are a whole sub-universe of their own.
There is a role for the Maltego/RiskIQ facet of the internet, but it’s greatly diminished compared to what it means to attribution.
There is a Social Media section in Semrush, but it’s also very “self-centered” - it wants logins to the social networks you use in the domain(s) you’re managing, and then you get your information, and only a little insight into opponents.
There is a temporal component but only some of it is explicitly available and I’m not sure if the time resolution is suitable for the cascade spotting I’d want to do.
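The first few entity types in that list can be sketched as a small data model. This is only my guess at a workable shape, not Semrush’s schema: the field names are invented, and grouping domains by a shared marketing tech identifier is just one plausible way to treat those identifiers as their own layer. The ad exchange, social, and temporal pieces are left out.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Domain:
    name: str
    authority_score: int = 0                      # node weight
    outbound: set = field(default_factory=set)    # names of domains this one links to
    trackers: set = field(default_factory=set)    # marketing tech identifiers seen on the site


def backlinks(domains):
    """Invert the outbound links to get the inbound (backlink) set per domain."""
    inbound = defaultdict(set)
    for domain in domains:
        for target in domain.outbound:
            inbound[target].add(domain.name)
    return dict(inbound)


def shared_trackers(domains):
    """Map each identifier to the domains carrying it, keeping IDs seen on two or more."""
    by_id = defaultdict(set)
    for domain in domains:
        for tracker in domain.trackers:
            by_id[tracker].add(domain.name)
    return {t: sorted(names) for t, names in by_id.items() if len(names) > 1}


if __name__ == "__main__":
    # Made-up domains and identifiers, purely for illustration.
    sample = [
        Domain("a.example", 40, {"b.example"}, {"UA-0000001-1"}),
        Domain("b.example", 25, {"a.example", "c.example"}, {"UA-0000001-2"}),
        Domain("c.example", 10, set(), {"UA-0000001-1", "UA-0000002-1"}),
    ]
    print(backlinks(sample))
    print(shared_trackers(sample))
```

Keeping links and identifiers as separate fields means each layer can be exported to its own edge list and examined alone or stacked with the others.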
What we’re going to do for the Malign Information Operations Safari is centered around the book Information Operations Recognition: From Nonlinear Analysis to Decision-Making. As I said at the start, this is technically demanding material.
So what I think I have to do to keep second quarter from being something only a handful of people can use is:
Always do some work with the minimal setup described in Two New Tools: SEOQuake & Semrush, so those who are just starting have a place to stand.
There must be periodic departures into the collection work that happens before analysis, which is similar in spirit to much of the prior work here.
The outputs have to be formatted in a comprehensible fashion for those who are directing activities, so that they can easily see the leverage points available.
There are several distinct audiences for this material, so introductions need to be good summaries, and Attention Conservation Notices need to let folks know which audience would be best served by the full article.
There is a deep need to get at least one site with Google Analytics that I can use as a lab rat for this stuff. I do not do this on the things that are overtly mine, and I am unwilling to admit to anything I might have access to that does have GA running.
Wow. OK. So this is one of those career change sized things that just pops up and grabs your attention. I guess I’d best get crackin’ …