rs
Dagobah Resident
Not sure if this has been seen yet. It is an interesting analysis of the importance of metadata.
http://kieranhealy.org/blog/archives/2013/06/09/using-metadata-to-find-paul-revere/
I have long maintained that all you need to know to understand the economy is 6th grade math. That is not true in this case. The agency mathematicians deal with things that most college students would not even RECOGNIZE as being math...
There is a reason why the agency is not listening to the phone calls: you can get 95% of the information you need (i.e. want...) from the metadata at <1% of the cost. Listening in on all of that phone traffic is very expensive. Obviously you cannot use humans to do it, so it would require a tremendous amount of computer horsepower, and the results would still contain lots of ambiguity. Not so with the metadata. As shown in the above analysis, the results from the metadata are clear, unambiguous, and easily obtained with modern computer systems. With the current state of the art, multiplying a million-by-million matrix is routine, especially when most of its entries are zero.
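To make the matrix step concrete, here is a minimal sketch of the kind of calculation the linked analysis describes: multiply a person-by-group membership table by its own transpose to get a person-by-person table of shared memberships. The names and memberships below are invented for illustration and are not the data from the linked post.

```python
# Toy membership table (invented): rows are people, columns are groups.
people = ["target", "bridge", "member_c", "member_d"]
groups = ["club_1", "club_2", "club_3"]
membership = [
    [1, 0, 0],  # target belongs to club_1 only
    [1, 1, 1],  # bridge belongs to all three clubs
    [0, 1, 0],  # member_c belongs to club_2 only
    [0, 0, 1],  # member_d belongs to club_3 only
]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

# Person-by-person table: entry [i][j] counts the groups i and j share.
co_membership = matmul(membership, transpose(membership))

def shared_links(m):
    """Sum each row, skipping the diagonal (a person's own memberships)."""
    return [sum(v for j, v in enumerate(row) if j != i) for i, row in enumerate(m)]

scores = dict(zip(people, shared_links(co_membership)))
best_connected = max(scores, key=scores.get)
print(best_connected, scores)
```

Nothing in the table says who talked about what, yet the person sitting between the groups stands out from the membership counts alone.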
The reason why the agency needs *all* of the data is that the math yields much less reliable results with a sparse matrix, i.e. one built from only a fraction of the records. If you had "probable cause" to investigate John Adams and restricted yourself to only his records, there would be no way for Paul Revere's name to pop up. That only happens when the entire dataset is available.
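A rough sketch of that last point, again with invented records, and assuming "restricted" means keeping only the rows that name the person under investigation:

```python
from collections import defaultdict

# Invented (person, group) records, for illustration only.
records = [
    ("target", "club_1"),
    ("bridge", "club_1"),
    ("bridge", "club_2"),
    ("bridge", "club_3"),
    ("member_c", "club_2"),
    ("member_d", "club_3"),
]

def contact_counts(recs):
    """For each person in recs, count the distinct people sharing a group with them."""
    by_group = defaultdict(set)
    for person, group in recs:
        by_group[group].add(person)
    linked = defaultdict(set)
    for members in by_group.values():
        for person in members:
            linked[person] |= members - {person}
    return {person: len(linked[person]) for person in {p for p, _ in recs}}

# Full dataset: the well-connected person is visible.
full = contact_counts(records)

# Restricted dataset: only the rows naming the person under investigation.
restricted = contact_counts([r for r in records if r[0] == "target"])

print(full)
print(restricted)
```

With only the one person's rows in hand, nobody else appears in the result at all, so there is nothing for the math to rank.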