Entity Resolution

Spock has a problem it wants to pay someone $50K to solve:

A common problem that we face is that there are many people with the same name. Given that, how do we distinguish a document about Michael Jackson the singer from Michael Jackson the football player?

That seems to me worth far more than $50K, since it directly affects everyone’s privacy, not to mention the future of criminal investigations.
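One naive way to attack the problem is to look at the words surrounding the name. The sketch below is only an illustration, not how Spock or anyone else actually does it: the keyword profiles, the function names, and the sample sentences are all made up for the example, and a real system would need far richer evidence than a handful of keywords.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical keyword profiles, invented for illustration only.
PROFILES: Dict[str, Set[str]] = {
    "Michael Jackson (singer)": {"singer", "album", "music", "song", "concert", "tour"},
    "Michael Jackson (football player)": {"football", "team", "season", "league", "game", "coach"},
}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def resolve(document: str, profiles: Dict[str, Set[str]] = PROFILES) -> Optional[str]:
    """Return the profile sharing the most keywords with the document.

    Returns None when no profile shares any keyword, i.e. the context
    gives no basis for choosing between the candidates.
    """
    tokens = tokenize(document)
    scores = {name: len(tokens & keywords) for name, keywords in profiles.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(resolve("Michael Jackson released a new album and announced a concert tour"))
    print(resolve("Michael Jackson joined the team for the football season"))
    print(resolve("Michael Jackson was seen yesterday"))
```

The third call returns None, which is the honest answer: with no context, the name alone resolves nothing.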

The Chinese government is already working on a solution, from a slightly different perspective:

Police in China, where most of the 1.3 billion people share just 100 surnames, are considering rules which would combine both parents’ family names to prevent so much duplication, state media said yesterday.

[…]

“By adopting both parents’ names, 1.28 million new surnames will be added, which will greatly solve the problem of name duplication,” Xinhua news agency said, citing the regulations.

This is just the beginning of the problem. Future generations may increasingly treat names as something that evolves rather than something fixed. The question then becomes how to tie together the history of names a person uses over a lifetime.
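A rough sketch of what such a name history might look like as data: one stable identifier per person, with each name attached to the period in which it was used. The class names, fields, and sample records here are hypothetical, and the sketch assumes the hard part (knowing that two names belong to the same person) has already been solved.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class NameRecord:
    name: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the name is still in use


@dataclass
class Person:
    person_id: str  # stable identifier that never changes
    names: List[NameRecord] = field(default_factory=list)

    def name_on(self, day: date) -> Optional[str]:
        """Return the name this person used on the given day, if any."""
        for record in self.names:
            started = record.valid_from <= day
            not_ended = record.valid_to is None or day < record.valid_to
            if started and not_ended:
                return record.name
        return None


def who_was(name: str, day: date, people: List[Person]) -> List[str]:
    """Return the ids of everyone who went by `name` on `day`."""
    return [p.person_id for p in people if p.name_on(day) == name]


# Hypothetical sample data.
PEOPLE = [
    Person("p1", [
        NameRecord("A. Smith", date(1980, 1, 1), date(2005, 6, 1)),
        NameRecord("A. Jones", date(2005, 6, 1)),
    ]),
    Person("p2", [NameRecord("A. Jones", date(1990, 1, 1))]),
]

if __name__ == "__main__":
    print(who_was("A. Jones", date(2000, 1, 1), PEOPLE))
    print(who_was("A. Jones", date(2010, 1, 1), PEOPLE))
```

The same name points to one person in 2000 and two people in 2010, which is exactly why a name without a date is a weak key.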

On a related note, mandarintools.com has a Chinese name generator. It would be funny if it only had ten names to choose from, based on the latest reports coming from China, but unfortunately it actually tries to create uniqueness.

Every time I run the program I get a completely different answer. Or should I say a differently resolved entity?

Bush may intentionally violate data-retention laws

It’s not just about explaining how and when the President does not have to honor seatbelt laws. Now it’s about data-retention violations too:

“Given the heavy reliance by White House officials on RNC e-mail accounts, the high rank of the White House officials involved, and the large quantity of missing e-mails,” the report said, “the potential violation of the Presidential Records Act may be extensive.”

Republicans said there is no evidence that the law was violated or that the missing e-mails were of a government rather than political nature.

The records act requires presidents to assure that “the activities, deliberations, decisions, and policies that reflect the performance” of their duties are “adequately documented … and maintained,” the report said.

Of course there is no evidence. That was destroyed too, along with the definition of government.

The drag car incident and risk

ESPN’s report on the Tennessee drag car incident has a very troubling quote:

Amateur video of the crash, broadcast on WMC-TV in Memphis, showed the car’s engine revving loudly before the vehicle sped down the highway. After a few hundred feet, the smoking car skidded off the road and into the crowd.

“It’s been a safe event until this year,” Police Chief Neal Burks said Monday.

With all due respect to the Chief, it has not been a safe event until this year. Rather, it has been an event without incident. The two conditions are vastly different and should never be confused when calculating risk.

In fact, I’ll go so far as to say it is a pet peeve of mine to find managers who say they have a safe environment when what they really mean is they are unaware of any incidents. Being lucky is definitely not the same thing as studying data and preparing for predictable outcomes.
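To put a rough number on the gap between "safe" and "without incident": if an event has run n times with zero incidents, the largest per-event incident probability p still consistent with that record at 95% confidence solves (1 - p)^n = 0.05. The sketch below assumes each year is an independent trial with the same incident probability, which is my simplification and not anything from the ESPN report. The function name is hypothetical, and n is taken from the 18 years the report gives for the show (17 if this year's event is excluded).

```python
def max_incident_probability(incident_free_events: int, confidence: float = 0.95) -> float:
    """Upper bound on per-event incident probability after a clean record.

    Solves (1 - p) ** n = 1 - confidence for p, assuming independent
    events with a constant incident probability (a simplifying assumption).
    """
    if incident_free_events <= 0:
        raise ValueError("need at least one incident-free event")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / incident_free_events)


if __name__ == "__main__":
    for n in (17, 18):
        bound = max_incident_probability(n)
        print(f"{n} incident-free events: p could be as high as {bound:.1%}")
```

Under those assumptions, a clean record of 17 or 18 years cannot rule out a yearly incident probability of roughly 15 to 16 percent. That is a long way from "safe".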

I wonder what the Chief would say if he pulled someone over for a safety violation (e.g. speeding, no seatbelt, drunk driving, etc.) and that person said “I have been safe so far”.

The crash occurred at a Cars for Kids charity show, which has been an annual event in this small town 80 miles east of Memphis for 18 years. The drivers always do crowd-pleasing burnouts — spinning the tires to make them heat up and smoke — at the end of the parade.

[…]

Cars for Kids holds several events throughout the nation and raises close to $200,000 annually for charities that help children in need, according to its Web site.

The charity was formed in 1990, two years after founder Larry Price’s son, Chad, suffered a severe head injury in a bicycle accident. Price promised that if his son was saved from lifelong injuries, he would spend the rest of his life raising funds for disabled children, according to the Web site.

So here is an interesting question: Would the crowds come and pay admission if there were less risk (to the driver, the environment, or themselves)? Seems to me there is some questionable judgment and sad irony in using high-risk activities to raise funds to pay for injuries from risky activities. Then again, maybe I’m a bit more sensitive than most to the risks of “burning” tires or “burnouts” for show.

Tires are not made of rubber, they are complex chemical mixtures that will release thousands of chemicals in mixtures that will create new ones, the health hazards of this are unknown. As a cancer researcher I know that mixtures of chemicals in low doses are cancer causing in humans, even if the individual chemical is not.

Would you like some asthma with those fries?