
Using Open Semantic Search

MIOS has people interested in digging.

When I posted Malign Influence Operations Safari, I had no idea what I was setting in motion for Q2 2024. Today there is a story brewing based on an international leak, a domestic leak that’s getting some attention, and a mess of a FOIA that is finally corralled, with just one update left before it’s deemed complete.

Open Semantic Search is orphanware. The original developer has wandered off, and while I learned all aspects of building the system, I’m not sure I would take on maintaining it, even if there were money on the table. We’re still running what was created as a single Debian 10.5 appliance four years ago, while the world has moved on to Docker-based solutions.

That’s just me whining about logistics, technical debt, and the never-ending wave of software updates I must surf. We have a massive pile of documents: a quick check shows 840k items taking up 197GB of disk space. We also have a working private faceted search engine; I can make a new instance in about half an hour, and then it will grind through whatever we give it to index.

You are all invited to have a look around the Disinfodrome Documents, 14k items that are focused on Trump Russia investigations. There are 425k documents in the Trump FEC system and I guess there’s no harm in throwing that open to the world, too. What’s in there are the massive final reports, which get diced into single pages for easy handling.
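
For anyone wondering what dicing a big report into single pages can look like, here is a minimal sketch. It assumes Python and the third-party pypdf library, and the function names and the `_p007` style file naming are made up for illustration; this is not necessarily how the Disinfodrome pipeline does it.

```python
from pathlib import Path


def page_filename(source: str, page_number: int, total_pages: int) -> str:
    """Build a zero-padded per-page file name (hypothetical naming scheme)."""
    width = len(str(total_pages))  # pad so files sort in page order
    stem = Path(source).stem
    return f"{stem}_p{page_number:0{width}d}.pdf"


def split_pdf(source: str, out_dir: str) -> list[Path]:
    """Write each page of `source` to its own PDF inside `out_dir`."""
    # Assumption: pypdf is installed (pip install pypdf). It is imported
    # here so the naming helper above works without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(source)
    total = len(reader.pages)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for number, page in enumerate(reader.pages, start=1):
        writer = PdfWriter()
        writer.add_page(page)  # one page per output file
        target = out / page_filename(source, number, total)
        with target.open("wb") as handle:
            writer.write(handle)
        written.append(target)
    return written
```

Calling `split_pdf("final_report.pdf", "pages/")` would leave one small PDF per page in `pages/`, and a directory like that is the sort of thing an indexer can grind through file by file.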

The FEC system is why the world knows the Trump campaign was paying James Troupis’ law firm at the end of 2020. I literally stumbled upon this, showed it to some people, and things have blossomed from there.

These systems are protected with Cloudflare ZTNA, so if you want access you’ll need to email nealr at pm dot me so I can add you.

Netwar Irregulars Bulletin v2.0
Disinfodrome
Exploring public document caches with Open Semantic Search.
Authors
Neal Rauhauser