Laser Stars

Reviewed in Amazing Computing Amiga in October, 1996.

WEB SITE CONSTRUCTION DETAILS

The Laser Stars site was created on an Amiga A4000, a 68040-based computer. This page mostly discusses construction details and web authoring with Amigas. For a more complete introduction, see Randy Finch's series of 8 articles on 'Web Typesetting' in Amazing Computing / Amiga.



Topics

  1. Hardware and Software
  2. FTP and Long Filenames
  3. CDROMS, FITS and AmiWin
  4. Scanners, Image Processors
  5. Interleaved GIFs
  6. Image Formats
  7. Image Processing
  8. Web Page Backgrounds
  9. Optical Character Recognition
  10. Math Typesetting
  11. Web Browsers
  12. Why Amiga ?

Hardware and Software

FTP and Long Filenames

The Amiga's support for long filenames and arbitrary-length file extensions made editing and updating the web site a breeze. With the OS-integrated FTPMount, the public_html directory on my info provider appears just like an Amiga drawer. Since I update my web site almost daily, managing its 4 megabytes of HTML, GIF and JPEG files, hundreds of them scattered over 25 directories, would have been prohibitively time consuming without FTPMount.

CDROMS, FITS and AmiWin

Gigabytes of data are now available from the National Space Science Data Center on hundreds of CDROMs from dozens of spacecraft and astronomical observatories around the world (much of this data is slowly appearing on the net as well). I use a double speed CDROM drive to read astronomical images in FITS format and convert them to GIF with XView running under AmiWin XWindows. The images are then converted to JPEG with ADPro.

Scanners, Image Processors

A hand scanner is used to digitize many diagrams and figures. The figures are scanned at 400 DPI, then 'cleaned up' with DPaint at the original full resolution. ADPro is then used to scale the image down, and DPaint is used again to clean up any remaining flaws and scaling errors. Some figures were too large to fit within 18 megabytes of RAM, so I used Virtual Memory Manager (VMM) to get more 'virtual' RAM. After reduction, single bitplane monochrome images require more bitplanes for 'anti-aliasing', and ADPro does an admirable job in 2, 4, 8 and 16 greyscale renditions with Floyd-Steinberg dithering and appropriate contrast adjustments. More greyscales are sometimes required, but often the improvement isn't noticeable, and they take up more storage and slow down transmission over the net. GIFTool was used to make transparent GIFs, which look much better when fewer greyscales are used. GIFTool was also used to strip comments from GIF images, which reduces web page transmission delays by about 5 percent.
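The reduction to a handful of grey levels can be illustrated with a minimal sketch of Floyd-Steinberg error diffusion. This is not ADPro's implementation, only the textbook form of the technique; the function name dither_to_levels and the list-of-rows image layout are assumptions made for the example.

```python
def dither_to_levels(image, levels):
    """Quantize a greyscale image (rows of 0..255 values) to `levels`
    evenly spaced greys (levels >= 2) with Floyd-Steinberg dithering."""
    height, width = len(image), len(image[0])
    # Work on a float copy so the diffused error can accumulate.
    work = [[float(v) for v in row] for row in image]
    step = 255.0 / (levels - 1)
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            # Pick the nearest available grey level, clamped to the range.
            index = min(levels - 1, max(0, round(old / step)))
            new = index * step
            out[y][x] = int(round(new))
            err = old - new
            # Push the error onto unvisited neighbours (7, 3, 5, 1)/16.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


# A horizontal grey ramp rendered with 4 grey levels (0, 85, 170, 255).
ramp = [[x * 16 for x in range(16)] for _ in range(4)]
for row in dither_to_levels(ramp, 4):
    print(row)
```

Calling the function with 2, 4, 8 or 16 levels gives the renditions mentioned above; fewer levels mean fewer bitplanes and a smaller GIF.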

Interleaved GIFs

I don't like interleaved (interlaced) GIFs because repeatedly looking at an image at various stages of coarseness or resolution wastes time and is distracting. I prefer to see the real thing rather than 'guess' when the figure is sharp enough for detailed examination. If the image is small enough there is no need for interleaving.
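The stages of coarseness come from the order in which an interlaced GIF stores its rows: four passes, each filling in rows between those already sent. A minimal sketch of that row order follows; the function name interlaced_row_order is made up for the example.

```python
def interlaced_row_order(height):
    """Return image row indices in the order an interlaced GIF stores them."""
    # (first row, row step) for the four passes of GIF interlacing.
    passes = ((0, 8), (4, 8), (2, 4), (1, 2))
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order


print(interlaced_row_order(16))
# [0, 8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]
```

After the first pass only one row in eight has arrived, which is why the browser shows a coarse picture that sharpens in three more steps.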

Image Formats

Certain images require the full 256 grey scales for correct rendering; in these cases JPEG is the more appropriate image format. Although many color pictures originally contain 256 colors or fewer, they become 24 bits deep when they are scaled down to thumbnail size. Most 24 bit color pictures are converted to JPEG because the maximum number of colors allowed in the GIF format (256) is insufficient to correctly represent the full range of colors. GIFs are acceptable for certain diagrams where palette reduction is tolerable, or for extremely small thumbnails where the quality does not suffer too much. The small size of JPEGs is an important factor when info providers place storage constraints on your web site. JPEGs are also much quicker to load, considering the current bandwidth limitations of most internauts.
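Why scaling down adds colors can be seen in a small sketch: averaging neighbouring pixels produces in-between values that were not in the original palette. The helper names downscale_by_two and count_colors, and the tiny two-color test image, are assumptions made for the example; a real image processor may scale differently.

```python
def downscale_by_two(pixels):
    """Halve an RGB image (rows of (r, g, b) tuples) by averaging 2x2 blocks."""
    out = []
    for y in range(0, len(pixels) - 1, 2):
        row = []
        for x in range(0, len(pixels[0]) - 1, 2):
            block = [pixels[y][x], pixels[y][x + 1],
                     pixels[y + 1][x], pixels[y + 1][x + 1]]
            # Average each of the three channels over the four pixels.
            row.append(tuple(sum(p[c] for p in block) // 4 for c in range(3)))
        out.append(row)
    return out


def count_colors(pixels):
    """Number of distinct RGB values in the image."""
    return len({p for row in pixels for p in row})


BLACK, WHITE = (0, 0, 0), (255, 255, 255)
# An 8x8 two-color image split by a diagonal edge.
source = [[BLACK if x < y else WHITE for x in range(8)] for y in range(8)]
thumb = downscale_by_two(source)
print(count_colors(source), count_colors(thumb))
```

Even this two-color picture gains a grey along the edge. With a 256-color original and a larger reduction factor, the thumbnail quickly needs more colors than a GIF palette can hold.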

Image Processing

There are other tricks to reduce the storage requirements of images. You can apply a mild low-pass convolution filter to the original image before scaling it down. This removes a great deal of noise and some actual high frequency image information, and the loss is noticeable as a slightly blurry look in the processed image. However, when the picture is scaled down the 'blurry' look disappears. The main reason lies in the scaling process: when an image is scaled down, much of the very high frequency spatial information is lost in the averaging anyway. Therefore, applying a low-pass filter before scaling doesn't significantly affect the subjective appearance of the final thumbnail image, but it does improve the efficiency of the JPEG compression scheme. JPEG compresses in the frequency domain (it uses the discrete cosine transform, a close relative of the Fourier transform), so removing some of the higher spatial frequency components means less spectral information is needed to reconstruct an accurate image, and hence less storage.
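A minimal sketch of the filter-then-scale idea, for a greyscale image stored as rows of numbers. The 3x3 kernel, the clamped edge handling and the helper names (low_pass, scale_down, roughness) are assumptions chosen for illustration; they are not the filter an image processor such as ADPro actually applies.

```python
KERNEL = ((1, 2, 1), (2, 4, 2), (1, 2, 1))  # mild low-pass, weights sum to 16


def low_pass(image):
    """Convolve a greyscale image with the 3x3 kernel above."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp so edge pixels reuse their nearest neighbour.
                    yy = min(height - 1, max(0, y + dy))
                    xx = min(width - 1, max(0, x + dx))
                    total += KERNEL[dy + 1][dx + 1] * image[yy][xx]
            row.append(total / 16)
        out.append(row)
    return out


def scale_down(image, factor):
    """Shrink the image by averaging factor x factor blocks."""
    out = []
    for y in range(0, len(image) - factor + 1, factor):
        row = []
        for x in range(0, len(image[0]) - factor + 1, factor):
            block = [image[y + j][x + i]
                     for j in range(factor) for i in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def roughness(image):
    """Mean absolute difference between horizontal neighbours,
    a crude stand-in for the amount of high frequency content."""
    diffs = [abs(row[x + 1] - row[x])
             for row in image for x in range(len(row) - 1)]
    return sum(diffs) / len(diffs)


# A checkerboard is the highest spatial frequency an image can hold.
checker = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
smoothed = low_pass(checker)
print(roughness(checker), roughness(smoothed))
print(scale_down(checker, 2)[1][1], scale_down(smoothed, 2)[1][1])
```

The filtered image is far smoother than the original, yet away from the borders the two thumbnails come out the same, which is the effect described above.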

Web Page Backgrounds

Although I personally find backgrounds distracting, there are some pages, like entertainment or amateur pages, where they can be appropriate, especially if there isn't too much text to read, as on index pages. If a background is complicated, like a photograph of scenery, it should be of low contrast, i.e. the colors should be very close together in RGB space. Because the human eye triggers on edges, it is best to keep to lower spatial frequencies and to reduce the image's contrast until the overlying text is easy to read (low contrast images look greyish and dull). If the image is smaller than the web page, there is an ADPro script called SeamLessMapADPro which can improve the tiling behaviour by smoothing the discontinuity between adjacent tiles.
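Bringing colors 'close together in RGB space' can be sketched as pulling every pixel part of the way toward one target color. The function name soften_background, the default target grey and the keep fraction are all assumptions for the example; suitable values depend on the picture and on the text color.

```python
def soften_background(pixels, keep=0.25, toward=(192, 192, 192)):
    """Reduce contrast by keeping only a fraction `keep` of each pixel's
    distance from the `toward` color (rows of (r, g, b) tuples)."""
    return [
        [tuple(round(t + (c - t) * keep) for c, t in zip(pixel, toward))
         for pixel in row]
        for row in pixels
    ]


# Pure black and pure white end up as two nearby greys.
print(soften_background([[(0, 0, 0), (255, 255, 255)]]))
# [[(144, 144, 144), (208, 208, 208)]]
```

A smaller keep value gives a duller background and more readable text; keep=1.0 leaves the image unchanged.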

Optical Character Recognition

To convert countless abstracts and astrophysics papers into computer readable ASCII text, I use the MiGraph Optical Character Recognition software. Since I am slow at typing, it would have taken me 10 times longer to enter this data via keyboard. Also, the accuracy of the recognition algorithm saved me from tedious hours of proofreading. HTML formatting codes were then added to the raw text. Since HTML is a rather simple markup language, I didn't find any advantage in using the various Web authoring tools available.

Math Typesetting

AmigaTeX was used to construct papers in TeX format and create PostScript versions (which are compressed with gzip for network transmission). Unfortunately the newest adopted HTML standard does not allow math, so equations must be converted to 'plain English' format on HTML pages. The output of LatexToHTML looks somewhat awkward, like those notorious letters in crime movies, made up of individual characters cut out in various fonts and glued together so that the author remains anonymous and untraceable. The number of GIFs increases dramatically in papers that are mostly formulae, and with hundreds of GIFs it begins to look ridiculous. In these particular cases, why not just scan in the whole page as a GIF, like they do on the Astrophysics Data System? Therefore LatexToHTML was avoided.

Web Browsers

For the greatest compatibility, non-standard NetScape extensions are avoided, except for CENTER and the IMG align="left" attribute. This extra image wrap attribute doesn't affect the appearance of web pages in Mosaic or Lynx but greatly improves readability with NetScape.

Before ALynx, I used Lynx from the UNIX shell of my info provider. I made extensive trials of AMosaic 1.2 and 2.0 prerelease 3, but concluded that they were simply too unreliable, especially on pages with many images. AMosaic 2.0 was particularly unstable with HTML FORMs. Scientific research makes very heavy use of search engines, abstract retrieval databases, data retrieval web pages and so on, all of which rely on HTML FORMs. AMosaic would crash and require a re-boot many times per hour. It wasn't suitable for serious academic use.

Then along came ALynx, a port of the popular text based Web browser available on UNIX platforms. With its HTML FORMs capability, ALynx was able to perform most of the tasks required in my research. Its reliability was excellent: after a whole year of use, it has never crashed!

When IBrowse was released, it was faster and could view certain NetScape pages better than AMosaic. However, the text was often scrambled and many of the inline pictures appeared scrambled as well. It was more stable than AMosaic, but still crashed occasionally. IBrowse was sometimes used, but only to view 'NetScape degraded' pages which could not be adequately viewed with Lynx.

Then AWeb was released with strict HTML 2.0 compliance. It is much more stable than IBrowse and faster because it doesn't require MUI. It rarely crashes, so I switched to AWeb.

Finally, Voyager was released. It was fast and reliable, and could display non-HTML 2.0 NetScape extensions such as image wrap (align="left" or align="right") and BACKGROUNDS, but not frames. Voyager has an annoying tendency to wait until most of the page is loaded before displaying anything, and often only half the information is presented; even clicking on 'reload' does not present the whole page. The only solution I have found is to click on the 'back' and 'forward' buttons in rapid succession. Voyager supports frames and many other 'Netscapisms' but isn't as fast as AWeb v2.1 because it depends on MUI.

An AWeb feature I find particularly useful is the parallel loading of multiple web pages. Suppose you click on a page with many thumbnails, but the page is unreadable until the GIFs are loaded and you don't want to wait staring at the ceiling. You click on the return button to continue reading the previous page while AWeb busily loads all the thumbnails of the page you just left. When the network status window becomes idle, simply click the forward button to return to the image intensive page and find everything already loaded into the buffer. This is a very useful form of multitasking. AWeb is also able to show partially loaded HTML files, which is useful in combination with the network status window displaying the bytes transferred versus the bytes remaining for each separate parallel network connection. Each connection could be the HTML file, which is partially loaded and displayed, or a GIF or JPEG thumbnail. With the click of a mouse you can pick out particularly large GIFs and cancel each transfer individually if you lose patience. I haven't seen this feature in NetScape or in Voyager.

Why Amiga ?

Although Amiga OS 3.0 is more efficient and compact than Windows 95 or OS/2, it's software that defines a computer and should be the deciding factor. After comparing Amiga programs with similar IBM or Macintosh products, I conclude that the Amiga gives you more control over the functioning and output of most programs, especially when connected to an info provider, where multitasking rules. Perhaps Amiga software programmers write with more technical users in mind. As an example, consider the extensive use made of ARexx scripts for program control, customization and inter-process communications in a fully multitasking environment. These features simply aren't available on the Mac or IBM. Also, on no other platform can you find so much high quality shareware as on the Amiga and its multi-gigabyte 'Aminet'. In Amiga Report No. 408 the following statement was made:

These are the reasons why Amiga is a suitable web authoring platform, although personal bias has also played a role in my decision. Computer users should not feel pressured into any particular platform by high powered media hype. On the net, content is king, and nobody knows which platform you use.

Unless of course you insist on degrading NiceLongFileName.html into STUPDNAM.HTM :-)

I would be very interested in hearing how other Amiga owners use various software/hardware products to construct scientific or academic web pages.

John Talbot,
jpctalbot@yahoo.com
