Who would have thought a story about reading thousands of pages of archived documents could be so riveting?
For anyone interested in how this task has changed, there are similar meta-stories about, to take one example, the process of analyzing the Panama Papers. The investigative collaboration that shared access to them digitized millions of pages and ran (among other things) named entity recognition and clustering / graph-discovery algorithms over them to find the signal in that particular heap of noise.
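To make that concrete, here's a minimal, purely illustrative sketch of the co-occurrence side of such a pipeline. The entity extraction is a toy regex standing in for a real NER model, and the documents and names are made up; the point is just the shape of the approach (extract entities per document, then weight edges by how often pairs co-occur):

```python
import re
from collections import defaultdict
from itertools import combinations

def extract_entities(text):
    # Toy stand-in for NER: runs of two capitalized words
    # (e.g. "John Doe"). A real pipeline would use a trained model.
    return set(re.findall(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", text))

def cooccurrence_graph(documents):
    # Edge weight = number of documents in which two entities co-occur.
    graph = defaultdict(int)
    for doc in documents:
        for a, b in combinations(sorted(extract_entities(doc)), 2):
            graph[(a, b)] += 1
    return dict(graph)

# Hypothetical mini-corpus for illustration only.
docs = [
    "Mossack Fonseca set up a shell company for John Doe.",
    "John Doe wired funds through Mossack Fonseca in 2008.",
]
print(cooccurrence_graph(docs))
# → {('John Doe', 'Mossack Fonseca'): 2}
```

Heavily co-weighted edges in a graph like this are where you'd start digging.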
If you’re looking for a project exploring / learning some new technology, I can only encourage you to try to find an angle that explores some public data or document set. That could yield something far more interesting than yet another RSS or HN client.
Sure, but how do you find a project like that to enter at a useful stage? I feel like with the Panama Papers, any semi-public plea for help sorting through that data would have thwarted the journalists.
The CIA discloses sets of documents from time to time, as do other government agencies. The CIA recently (in the past few years) released a huge collection of PDFs regarding their investigation into psychic phenomena. I only read a few of these documents at random (bizarrely, the CIA seemed, or seems, to have had some success with psychics), but I'd be interested if someone could synthesize them into something meaningful, or explain exactly what the CIA was doing, to what extent they believed in psychics, and what evidence they really have.
In short, the Russians had a project to transmit psychic messages to submarines. This would have been of vital military importance had it worked. So the US tried it too, and then they also tried LSD and other drugs (for interrogation) and subconscious hallucinogenic orders (for agents). This worked better, but of course was also shut down (MKULTRA).
They still rely on the equally unscientific lie detector tests, which work even worse than Scientology's methods, yet are very similar to them.
Data.gov, the public datasets on BigQuery, the Internet Archive, etc. The Twitter API is also relatively open, as is Flickr's. There's a lot of data to play around with, even if not all of it is as scandalous as the Panama Papers (but who knows...)
Yes! This is one of those rare pros who knows how to turn a discovery process into a riveting story ... and in the telling awaken an interest in a time, a place, a people. (Halfway through, I had to know more about the Pedernales.)
It's like the difference between a rock and a geode.
If you enjoyed the article, I'd highly suggest "The Power Broker" by Robert Caro (the author). It's the best non-fiction book I've ever read and the only 1300 page book that I felt was too short.
Although I join you in deploring the sloppy usage of "enormity" to merely mean big size, I thought the word perfectly captured the _scariness_ of the size of the research task ahead. Centuries of usage agree: https://www.merriam-webster.com/dictionary/enormity#usage-1
I spent a couple hours in the LBJ library/museum last year and Caro is certainly correct - it’s striking to see the boxes stacked for what feels like miles.
Johnson's payola power-broker lifestyle was mostly run-of-the-mill stuff. His relationship with Bobby Baker, though, was open and completely insane; it would be nice to know more about that.