06 December 2006

Digitization is never perfect

Perhaps this was covered in the 'capturing' track, but I didn't see it in their presentation:

Is it 100% clear to everyone that NO digitization captures everything, even of a 2-dimensional object like a drawing or photograph? Color representations are always approximate and will depend on lighting choices and conditions; cameras introduce artifacts into the captured images; and the pixels are always coarser than the finest detail in the original object. For 3-dimensional objects the problem is, obviously, far more acute.

As there will always be newer and better ways to digitize coming along (see e.g. the Economist Technology Review 12/2/2006, p.6), SI should anticipate an ongoing need to re-digitize the collections. This leads to a need to view the quality of digitization required not as one value, but as a range of values for different uses. (Getting back to the 'using' track.)

For searching archives, a low-resolution image may be enough, and may be cheap and quick to create for an entire collection. In-depth research on an image, or making a calendar picture, will require far more. Beyond that, the contents of an image may be encoded, either by inspection or soon by software, to provide a higher level of digitized information about the collections.

As soon as the digital archive includes more than one image of the same object, a way of cross-linking them is needed, not just to identify that they are indeed of the same object, but to pinpoint locations within each image as being of the same section of the original; this implies a scale and a coordinate system. For 3-dimensional objects, coordinate systems are even more important.
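
To make the cross-linking idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `ImageFrame` record, its field names, and the choice of millimetres on the original as the shared coordinate system are hypothetical, and the sketch assumes the images differ only in scale and offset (no rotation or lens distortion).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageFrame:
    """Hypothetical record tying one digital image to the physical object."""
    pixels_per_mm: float  # scale of this capture
    origin_x_px: float    # pixel column of the object's reference corner
    origin_y_px: float    # pixel row of the object's reference corner

    def to_object(self, x_px, y_px):
        """Convert a pixel position to millimetres on the original."""
        return ((x_px - self.origin_x_px) / self.pixels_per_mm,
                (y_px - self.origin_y_px) / self.pixels_per_mm)

    def from_object(self, x_mm, y_mm):
        """Convert millimetres on the original to a pixel position."""
        return (x_mm * self.pixels_per_mm + self.origin_x_px,
                y_mm * self.pixels_per_mm + self.origin_y_px)


def transfer(src, dst, x_px, y_px):
    """Find the pixel in dst showing the same spot as (x_px, y_px) in src."""
    return dst.from_object(*src.to_object(x_px, y_px))


# A low-resolution search image and a high-resolution research image
# of the same drawing (made-up numbers).
thumbnail = ImageFrame(pixels_per_mm=2.0, origin_x_px=10.0, origin_y_px=10.0)
master = ImageFrame(pixels_per_mm=40.0, origin_x_px=200.0, origin_y_px=200.0)

spot_in_master = transfer(thumbnail, master, 30.0, 50.0)
```

The point of going through object coordinates, rather than linking each pair of images directly, is that a newly made image only needs its own scale and origin recorded to be linked to every earlier one.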

Having multiple images of differing quality leads to the ideas, which we use widely in astronomy projects, of 'versions' of data sets and of 'levels' of processing, with each level doing more to the data.
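
As a rough illustration of 'versions' and 'levels', the sketch below keeps one record per data product and picks the newest version at a requested level. The record layout, the field names, and the meaning given to each level number are assumptions made up for this example, not a description of any particular astronomy archive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical catalogue entry for one digitized product."""
    object_id: str  # identifier of the physical object
    level: int      # assumed: 0 = raw capture, 1 = color-corrected, 2 = contents encoded
    version: int    # increases each time the product is re-made


def latest(products, object_id, level):
    """Return the newest version of an object at a level, or None if absent."""
    candidates = [p for p in products
                  if p.object_id == object_id and p.level == level]
    return max(candidates, key=lambda p: p.version, default=None)


catalogue = [
    DataProduct("drawing-001", level=0, version=1),
    DataProduct("drawing-001", level=0, version=2),  # re-digitized later
    DataProduct("drawing-001", level=1, version=1),
]

newest_raw = latest(catalogue, "drawing-001", level=0)
not_yet_encoded = latest(catalogue, "drawing-001", level=2)
```

Keeping older versions in the catalogue, rather than overwriting them, means that work done against an earlier digitization can still be traced to the exact image it used.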
