Scientists at Xerox's research laboratory in Webster, NY, are making progress with a technology that automatically protects and selectively reveals information contained in documents, without creating multiple versions of them or hogging as much memory as today's encryption programs do. In essence, the Xerox software analyzes language to determine whether words and phrases (like "alcohol abuse" or "HIV"), taken in context, are private and should be reserved for doctors, or whether, say, a string of nine digits--potentially a Social Security number--should be seen only by a personnel officer. A single stored form or document reveals its parts according to users' authorization levels: someone in the hospital scheduling office might see only a patient's address and phone number.
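The idea of one stored document revealing different parts to different readers can be illustrated with a minimal Python sketch. It is not Xerox's method: the article describes context-aware language analysis combined with encryption, while this stand-in uses fixed patterns and simple masking. The rule table, the role names, and the `reveal` function are all hypothetical, invented for illustration.

```python
import re

# Hypothetical rules: (label, pattern, roles allowed to see the match).
# Fixed patterns stand in for the contextual language analysis described above.
RULES = [
    ("medical", re.compile(r"\b(?:alcohol abuse|HIV)\b", re.IGNORECASE), {"doctor"}),
    ("ssn", re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b"), {"personnel"}),
]


def reveal(text, role):
    """Return the text with every span the role may not see replaced by a placeholder."""
    for label, pattern, allowed in RULES:
        if role not in allowed:
            text = pattern.sub(f"[{label} withheld]", text)
    return text


# A single stored record, viewed under two assumed roles.
record = "Jane Doe, 12 Elm St, 555-0100. SSN 123-45-6789. History of alcohol abuse."
print(reveal(record, "scheduler"))
print(reveal(record, "doctor"))
```

In this sketch a scheduler sees the address and phone number but neither the nine-digit number nor the medical phrase, while a doctor sees the medical phrase but not the nine-digit number. A real system would also need to encrypt the withheld spans rather than only hide them at display time.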
The researchers are working to make the core technology compatible with existing formats, including PDFs and customized hospital forms. "We've scoured the landscape, and there is no technology out there that marries content analysis with encryption so that the whole process becomes automated," says Shriram Revankar, who heads Xerox's smart-document lab in Webster.
Such technology is badly needed, says Kenneth H. Buetow, director of the National Cancer Institute's Center for Bioinformatics in Bethesda, MD. Current privacy laws make it difficult for researchers to share patient data from medical trials. "It represents a potential solution to the sharing of information in compliance with human subjects' privacy protections," says Buetow, who is assembling a cancer research database. He cautions that the technology is still unproven, but Xerox hopes it will be ready to commercialize in two to five years.