The importance of metadata to the NSA.

rs

Dagobah Resident
Not sure if this has been seen yet. It is an interesting analysis of the importance of metadata.

http://kieranhealy.org/blog/archives/2013/06/09/using-metadata-to-find-paul-revere/

I have long maintained that all you need to understand the economy is 6th-grade math. That is not true in this case. The agency's mathematicians deal with things that most college students would not even RECOGNIZE as math... There is a reason the agency is not listening to the phone calls: you can get 95% of the information you need (i.e. want...) from the metadata at less than 1% of the cost. It is very expensive to listen in on all of that phone traffic; obviously you cannot use humans to do it, and doing it by machine would require a tremendous amount of computing horsepower while still producing ambiguous results. Not so with the metadata. As the analysis above shows, results from metadata are clear, unambiguous, and easily obtained with modern computer systems. With the current state of the art, multiplying a million-by-million matrix is trivial.

The reason the agency needs *all* of the data is that the math yields much less reliable results on a sparse matrix. If you had "probable cause" to investigate John Adams and restricted yourself to only his records, there would be no way for Paul Revere's name to pop up; that requires the entire dataset.
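The trick in the linked post can be sketched in a few lines. This is a toy version, not Healy's actual computation: the group names are real 1770s Boston organizations he mentions, but the membership pattern below is invented purely for illustration.

```python
# Toy sketch of the person-by-organization analysis from the linked
# Kieran Healy post. Memberships are invented for illustration -- this
# is not the post's actual historical data.

people = ["Adams", "Revere", "Warren", "Church"]
groups = ["StAndrewsLodge", "LoyalNine", "NorthCaucus"]

# Membership matrix M: M[i][j] = 1 if person i belongs to group j.
# This is pure "metadata" -- who belongs where, no message contents.
M = [
    [0, 1, 1],  # Adams
    [1, 1, 1],  # Revere
    [1, 1, 0],  # Warren
    [0, 0, 1],  # Church
]

# M x M^T: entry (i, k) counts how many groups person i shares with
# person k. One matrix multiplication turns "who joined what" into
# "who is connected to whom, and how strongly".
n = len(people)
co_membership = [
    [sum(M[i][j] * M[k][j] for j in range(len(groups))) if i != k else 0
     for k in range(n)]
    for i in range(n)
]

# Crude centrality: total shared memberships with everyone else.
scores = [sum(row) for row in co_membership]
most_connected = people[scores.index(max(scores))]
print(most_connected)  # -> Revere
```

Note that the last step only surfaces Revere because every person's row is in the matrix; restrict it to Adams's row alone and the shared-membership counts that point to Revere are never computed, which is exactly the "needs all the data" point above.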
 
rs said:
Not sure if this has been seen yet. It is an interesting analysis of the importance of metadata.

http://kieranhealy.org/blog/archives/2013/06/09/using-metadata-to-find-paul-revere/

I have long maintained that all you need to understand the economy is 6th-grade math. That is not true in this case. The agency's mathematicians deal with things that most college students would not even RECOGNIZE as math... There is a reason the agency is not listening to the phone calls: you can get 95% of the information you need (i.e. want...) from the metadata at less than 1% of the cost. It is very expensive to listen in on all of that phone traffic; obviously you cannot use humans to do it, and doing it by machine would require a tremendous amount of computing horsepower while still producing ambiguous results. Not so with the metadata. As the analysis above shows, results from metadata are clear, unambiguous, and easily obtained with modern computer systems. With the current state of the art, multiplying a million-by-million matrix is trivial.

The reason the agency needs *all* of the data is that the math yields much less reliable results on a sparse matrix. If you had "probable cause" to investigate John Adams and restricted yourself to only his records, there would be no way for Paul Revere's name to pop up; that requires the entire dataset.

Well, that is rather interesting.

Might explain why the NSA wanted all of Verizon's call data.

It could also explain why the wrong people end up on "no-fly" lists and that kind of thing. It's not necessarily that they share a name with somebody else, but that processing the metadata sometimes gives iffy results. So they "err on the side of caution"...

Hmm.
 
Mr. Scott said:
rs said:
Not sure if this has been seen yet. It is an interesting analysis of the importance of metadata.

http://kieranhealy.org/blog/archives/2013/06/09/using-metadata-to-find-paul-revere/

I have long maintained that all you need to understand the economy is 6th-grade math. That is not true in this case. The agency's mathematicians deal with things that most college students would not even RECOGNIZE as math... There is a reason the agency is not listening to the phone calls: you can get 95% of the information you need (i.e. want...) from the metadata at less than 1% of the cost. It is very expensive to listen in on all of that phone traffic; obviously you cannot use humans to do it, and doing it by machine would require a tremendous amount of computing horsepower while still producing ambiguous results. Not so with the metadata. As the analysis above shows, results from metadata are clear, unambiguous, and easily obtained with modern computer systems. With the current state of the art, multiplying a million-by-million matrix is trivial.

The reason the agency needs *all* of the data is that the math yields much less reliable results on a sparse matrix. If you had "probable cause" to investigate John Adams and restricted yourself to only his records, there would be no way for Paul Revere's name to pop up; that requires the entire dataset.

Well, that is rather interesting.

Might explain why the NSA wanted all of Verizon's call data.

It could also explain why the wrong people end up on "no-fly" lists and that kind of thing. It's not necessarily that they share a name with somebody else, but that processing the metadata sometimes gives iffy results. So they "err on the side of caution"...

Hmm.

Yes, it's much easier to process the externals than the internals. The externals (metadata) can be used to target internals efficiently, based on whatever criteria are specified. Where this appears to be heading is arrest/incarceration based on metadata, and "conviction" based on manipulation of internals - all under the cloak of "National Security".
 