Atari Diskcomm archive to SIO2PC disk image converter.
Version February 1998.

Preface.

I think this program hardly needs any introduction. Or maybe I am saying this because it is so hard to write one.

I wrote this program to solve a problem. I have a relatively large number of Diskcomm archives, and I need to convert them to SIO2PC disk images. There are several ways to reach the same goal, and there are several similar utilities already in existence. You might ask why I would need to write yet another one. Well, I wanted to be sure that these utilities work the way Diskcomm does. There are very few people who know exactly what it is that Diskcomm does, and only with this knowledge is it possible to write a utility like this.

If you cannot be sure about the utilities, you are forced to use the original. Since I have the Diskcomm archives on my PC, this involves downloading them all to our XE, converting them to disks again with Diskcomm, and copying them back to the PC with a sector copier and SIO2PC. This sounded like a lot of work. I did do this for a couple of them, just to verify that the resulting disk images were identical to the output of the utilities. Since there were differences, I had to come to the conclusion that these utilities were unreliable. I also concluded that I was not going to convert 500+ Diskcomm files by downloading them. So I was forced to write something myself.

Time for some research. I always wanted to know what the specifications for the Diskcomm format were. In the near future, we want to be able to convert all ATR files on our CD to Diskcomm files. We are not sure that there will be a lot of people asking for a CD in this format, but several people have told us that they would not be interested in a CD unless it was in the Diskcomm format. To be able to do this, you need to know the format.

As we all know, Bob Puff wrote Diskcomm, so we asked him for the specs. Clear thinking. Unfortunately, he is very busy.
And if you write tools like these, you just write the program, without first writing down the specs. You make them up as you go along. So he would have to sit down and write them up. Well, at this point I had several good reasons for wanting to have the format specifications right away. So I had to start a reverse engineering project to figure out the format.

This has been done before. Jason Duerstock started to figure it all out by converting various diskettes to archives, and then determining what it was that Diskcomm stored in the archive. He also wrote the first conversion program. He did a nice job, and wrote down some specs. He even shared them with others, so now there are several of these programs. Unfortunately, he did make some minor mistakes. Other people discovered some of them, and corrected these. But there are still some undiscovered problems, as the comparison with the real Diskcomm output proved.

I took a different approach. I disassembled the code. Now this is not a simple task, mainly because it takes a lot of time. On the other hand, I have done this a lot, so within several weeks I more or less had a source code listing that I could start to interpret. There are a lot of hours in a week if you do not sleep. And even then there are too few. But in the end, you can simply see what it is that Diskcomm does, so at a certain point, you can guarantee that what your program is doing is correct.

I discovered that the current version of Diskcomm still reads the archive format of an older version of Diskcomm. It will not create archives in this format anymore itself though. So this is something one would never have been able to find out with other methods of reverse engineering.

Purpose of this project.

The design goal of this project is to have a way of converting Diskcomm archives to SIO2PC diskette image files, on the system that uses these images. Another goal of this project is to finally document the format of Diskcomm archive files.
This opens up the way for more utilities.

Theory of operation.

People wanted to be able to transmit programs over telephone lines. Almost all communications programs are capable of transmitting files with a modem. However, on the Atari, lots of programs were designed as a bootable disk. The programs on bootable disks cannot be easily transmitted with these communications programs. So there was a need to convert bootable disks to a format that can be transmitted.

Diskcomm was written for this purpose. It is designed to convert the data on a diskette into a format that can be transmitted with a communications program. After transmitting it, the data must be written to a diskette again, and thus effectively the contents of the diskette are transported across many miles instantly. That is what telecommunications is all about.

So Diskcomm simply reads a disk sector by sector, and stores the data it finds in a big file that holds the contents of all the sectors on the disk. After this file has been transmitted, it can be written to the diskette at the receiver location. So Diskcomm is in fact really only a sector copier by modem.

This sounds simple, but life just isn't simple sometimes. One problem you would encounter is that when DOS stores files on a disk, it adds some overhead to keep track of the data and the available space on the disk. Now if you are going to copy all sectors on the disk, and add some overhead to this, then it is clear that you will not be able to write a file that big to a single diskette. It is obvious that this will not work.

Well, one solution is to cut the file in half. Simply write the first half of the disk to one file, and the second half to another, using another diskette. This means you will have to transmit two files, and at the receiver side, you would need to write the data of both these files to the diskette. This is a bit inconvenient, but not a big problem. Of course, there are other ways of solving this problem.
How about figuring out some way of compressing the data? If you can reduce the number of bytes needed to represent the contents of the diskette, you might be able to store all data in one file, and have it fit onto a diskette even with the added overhead. This would make the process of transmitting disks a lot easier.

You already figured out that Diskcomm does some compressing of course. The typical Diskcomm archive file is some 10 percent smaller than the size of the disk. This may not seem much, but it is enough to make the archive fit on a diskette. Remember that the current version of Diskcomm is dated December 1987, when the more elaborate compression algorithms were not yet in widespread use. Some disks compress rather well, so the result varies by content.

The current version of Diskcomm tries a couple of tactics. All compression is done on a sector basis. It tries to detect repetitions. For instance, if two consecutive sectors have identical content, the data is not stored twice, but Diskcomm stores an indicator to flag that the contents of the next sector are identical to the previous one. This saves a lot of space in the archive. Or when a sector contains a run of identical bytes, the bytes are replaced by an indicator that represents the number of repetitions. Of course this again adds some overhead, since now instead of simply storing all sectors in a file, you also have to store the compression type applied to each sector. In the end, this extra overhead is negligible compared to the amount of space saved by the compression.

Another way of reducing the data is simply leaving sectors out. Most diskettes have at least a couple of sectors that are unused. These sectors contain nothing but zeroes. We can simply leave these out, without losing any data. Of course this means that we have to keep track of what sector to write and what sector to skip when we are writing to the diskette again.
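
The tactics described above can be pictured with a small C sketch. This is only an illustration of the ideas (skip empty sectors, flag a repeat of the previous sector, run-length encode repeated bytes); it is not the Diskcomm archive format. The enum values, the function names, the (count, value) pair layout and the 128 byte sector size are all assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 128  /* assumed: a single density sector */

/* Hypothetical per-sector choices; these are NOT Diskcomm's real codes. */
enum sector_kind {
    SECTOR_EMPTY,             /* all zeroes: leave it out of the archive */
    SECTOR_SAME_AS_PREVIOUS,  /* store only a "repeat" indicator */
    SECTOR_RUN_LENGTH,        /* store (count, value) pairs */
    SECTOR_RAW                /* store the bytes unchanged */
};

/* Returns 1 if every byte in the sector is zero, otherwise 0. */
static int sector_is_empty(const unsigned char *s)
{
    size_t i;

    for (i = 0; i < SECTOR_SIZE; i++)
        if (s[i] != 0)
            return 0;
    return 1;
}

/*
 * Run-length encode one sector as (count, value) pairs.
 * out must have room for 2 * SECTOR_SIZE bytes (the worst case).
 * Returns the number of bytes written to out.
 */
static size_t rle_encode(const unsigned char *s, unsigned char *out)
{
    size_t i = 0, n = 0;

    while (i < SECTOR_SIZE) {
        unsigned char value = s[i];
        size_t run = 1;

        while (i + run < SECTOR_SIZE && s[i + run] == value)
            run++;
        out[n++] = (unsigned char)run;  /* run is 1..128, fits in a byte */
        out[n++] = value;
        i += run;
    }
    return n;
}

/* Pick the cheapest representation; prev may be NULL for the first sector. */
static enum sector_kind choose_encoding(const unsigned char *cur,
                                        const unsigned char *prev)
{
    unsigned char packed[2 * SECTOR_SIZE];

    assert(cur != NULL);
    if (sector_is_empty(cur))
        return SECTOR_EMPTY;
    if (prev != NULL && memcmp(cur, prev, SECTOR_SIZE) == 0)
        return SECTOR_SAME_AS_PREVIOUS;
    if (rle_encode(cur, packed) < SECTOR_SIZE)
        return SECTOR_RUN_LENGTH;
    return SECTOR_RAW;  /* compression would only make it bigger */
}
```

Note the last case: if the run-length form is not smaller than the sector itself, the sketch keeps the raw bytes, which is the "added overhead" problem in miniature.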
All this compressing makes the process a bit more complicated when we want to copy the data back to the diskette. But that is not a problem, if we know what types of compression to expect. Well, we have talked about this issue before.

Now what happens if the data on the diskette is such that there are no similar sectors? And if we try the other compression algorithms, what if it is simply impossible to compress the data? Then the added overhead will make the file grow even larger. So it is obvious that we have to combine the two techniques. If we discover that the archive file does not fit onto one diskette, we have to split up the archive into multiple smaller archive files. It is hard to tell up front what to do, so most of the time, we simply rely on the compression, and hope that it will fit. If it doesn't, we will have to start over and create multiple smaller archive files.

Most of you already know all this of course. I just wanted to sum this up, so that we all know what we are talking about.

As can be expected, to organize all this compressing and splitting up, we need some control information to prevent the user of the Diskcomm program from making mistakes. Some information needs to be added to the archive file to indicate that the archive was split up into multiple files. Also, some information about the diskette type is useful, since we might have to format the output diskette before writing the sectors to it.

Well, enough of this techno-talk. Since this project had two goals, I will write two separate documents. One for the people that just want to use this utility, and one for the people that are interested in the structure of Diskcomm archives. The other document is included in this package too. It is the source code. If you can read 'C' code, you can see what the format of a Diskcomm archive is. If I have some more time, I might write an English version, but for now the source will have to do.
For people that want to run this program on some other platform, you should be able to compile the source on any platform without problems, since it only uses standard 'C' stuff. Well, as usual the Atari ST has some problems with command line stuff, but if you write programs on the ST, you should know how to deal with this.

I consider this source code to be public domain. Use whatever you want to write your own utility. I will retain the copyright to the program name, so you will have to give your program a different name. If you want to use this program as a basis for a utility with a completely different purpose, this should not be a problem. I just want people to know which program I wrote. If you only want to port it to another system, using the same file name, feel free to do so, and post your results. Please include the documentation in this case. However, if you want to change the program substantially, use a different name.

If you discover bugs in the PC version, report them to me, so that I will be able to fix them and release a new version. I do have compilers for the Macintosh and the ST, but I have limited time, so let other people join the fun if they have some more time.

The output.

During the development stage, I had the program print a lot of information to the screen. As usual, printing to the screen makes the information scroll by faster than you can read, so I redirected the output to a file. All the relevant data can then be examined at your own pace. This data is only important if there is a bug in my program, and you want to be able to fix it. Well, if the archive has been damaged, you might also be able to determine what is wrong with the file. Not that this would help you a lot though. If you want to see this diagnostic data, you can make the program print it by using the command line switch or by responding with yes if the program asks for this option.
When the conversion process is complete, the program will print statistics about the Diskcomm file processed, if so requested. The program prints the number of headers encountered. For archives that have been created as one large file, we will encounter one or more FA type headers. For archives that have been created as multiple smaller files, we will encounter several F9 type headers.

For each type of compressed sector, the program counts the number of times that this type of compression occurs in the archive. This is printed, as is the total number of compressed sectors. Sectors that are completely filled with zeroes are not stored in the archive, so they cannot be counted. We do know how many sectors we created in the output file, so this can be printed. If we subtract the total count of sectors stored in the archive from the total number of sectors in the output file, we know how many sectors were empty. All this information is not really important, but it is fun to know. The count of the flush buffer codes should be equal to the count of headers encountered. Again, this is not really important.

Command line options.

Running the DCM2ATR program is easy. You have two options.

You can run it interactively, by simply starting it. It will ask for the filename of the Diskcomm archive file. Once that is entered, it will ask you to enter the filename for the SIO2PC disk image file to be created. Next, you can specify whether or not you want the diagnostics to be printed. If you wish this data to be printed, it will be printed to the screen. The output can be redirected to a file using the standard DOS redirection, but that would send the prompts to the file too, so this is only recommended if you use the command line arguments. You are then asked to enter whether or not you want to see the statistics for the Diskcomm archive. This will produce a list of the number of blocks encountered for each compression type. Then you are asked to enter the highest sector number.
If this is a standard diskette, simply press return. Otherwise enter the number of sectors that are available on the type of diskette you are processing.

You can enter all this on the command line if you prefer. The printing of diagnostic data is an option switch, which can be selected by adding /d to the command. The printing of the statistics can be selected by adding /s to the command. The Diskcomm archive file to process is the first argument on the command line. The filename for the SIO2PC disk image file is optional, and if it is present, it is the second argument. If it is omitted, the input file is assumed to have a .DCM extension. This extension will be replaced by the .ATR extension. The resulting filename is then used to create the SIO2PC disk image file.

So if we want to process a file named demo.dcm in the \xl_dcm directory, we would enter:

    dcm2atr /d /s \xl_dcm\demo.dcm \xl_atr\demo.atr > demo.txt

This would select the printing of diagnostic data, redirecting it to the file demo.txt in the current directory. After successful conversion, statistics about the conversion process are printed. Since all output is redirected, this too will be stored in the file demo.txt.

If the Diskcomm archive file is created from a disk of non-standard size, we will have to enter the highest sector number to expect. This is option H in the Diskcomm program. For this program, it is only used to append empty sectors to the end of the disk image. If the original disk contained 1440 sectors, the Diskcomm archive will simply contain that many sectors, unless the last sectors contain zeroes. For most applications, this is not a problem. If the programs on the disk do not need these sectors, and you are not going to write to the disk image, you do not need these extra sectors. On the other hand, if you want to be able to make modifications, you have to have these sectors available.
In this case, you must tell this program what the highest sector number is, using the /h command switch. After the /h, you have to specify the number in decimal notation. For 1440 sectors, you would have to enter /h=1440. The program will then make sure that there are at least 1440 sectors in the file. This option cannot be used to truncate a file, since we will always process all the sectors that are in the Diskcomm archive. It only tells the program the minimum number of sectors.

This option is not needed for archive files created from standard diskettes. For a normal single density disk, it is assumed that it contains 720 sectors. Enhanced density disks are assumed to contain 1040 sectors. Double density disks are assumed to have 720 sectors. Note that on all Atari double density disks, the first three sectors are always only 128 bytes, while the rest contain 256 bytes. This is only important if you compute the size of the SIO2PC disk image file. The program does this for you, and it stores this information in the header of the disk image.

If you want to override the standard sizes, to save disk space for instance, you could use this option to force the program to not append any unused sectors at the end, by specifying a value of 1. So this might truncate the disk image a little. So I lied. Anyway, you can only control the number of sectors that contain zeroes.

Note that adding extra sectors beyond the original number of sectors of the disk does not make more space available on the disk image, since the VTOC and other DOS information on the disk is not modified accordingly. If your program handles all sector I/O itself though, this can be used to make the disk bigger. But you can create a large disk with SIO2PC itself, so you do not need this program for that.

Multi file archives.

As we already discussed, some disks need to be split into several smaller archive files.
This program is kept relatively simple, so it simply does not handle multi file archives the way Diskcomm does. I could have added that, but that would only cost me extra time. The current version of this program does convert the Diskcomm archive files properly, so I reached my goal. Other people are very welcome to enhance this program; I do not have a need for more functionality. Besides, I also do not have the time for it. We have a CD to finish.

When Diskcomm converts an archive to a diskette, it will know that this is a multi file archive, because the first byte will contain F9. It will then ask what byte of the filename is to be incremented when the end of the file has been reached. To keep this program simple, we will assume that the last byte of the file name portion will be incremented. This must be a "1" or an "A". The program will scan the filename until it finds either a "1" or an "A". When the program cannot find either of these, the last byte before the extension is incremented. This byte will then be incremented for each file that is processed. If there is only one file, no incrementing is needed of course.

You can convert a multi file archive into a single file archive by merging all parts of the archive. This is very easy, since most systems have some way of concatenating multiple files into one file. On a DOS PC, this can be done with the copy command. Suppose you have an archive that consists of three files: demo1.dcm, demo2.dcm and demo3.dcm. To merge these files enter:

    copy /B demo1.dcm + demo2.dcm + demo3.dcm demo.dcm

to create a single file archive. The /B switch is required to make DOS treat the files as binary files, so that it does not treat the hexadecimal value 1A as an end of file marker. You could try using wild cards:

    copy /B demo?.dcm demo.dcm

but this will give you some weird error message. That is because the wild card also matches the resulting demo.dcm file, so DOS tries to concatenate that file too.
So you would have to give it a different name, or specify a different directory. But even then, if the files are not concatenated in the proper order, you will not be able to convert the archive into a disk again. DOS does not sort files by filename, so using wild cards is not recommended. There is a big chance that the files will not be processed in the proper order. If you do use wild cards, make sure you pay attention to the sequence in which the files are processed. If copy does not process the files in the proper order, enter all files in the proper order on its command line. Refer to your system's manual if you need more information.

Troubleshooting.

If a Diskcomm archive cannot be converted to a SIO2PC disk image, there are very few things you can do about it. Maybe the file was not properly uploaded or downloaded. Sometimes, uploading or downloading a file may add extra bytes to a file. The program tries to detect this. After a flush buffer code, the program starts looking for a header byte again. Any junk that is found is simply discarded. This should take care of the problem. However, if the file is damaged because of some sort of transmission or conversion error, we will most likely not be able to use it.

Some errors might occur with multi file archives. When the end of a file has been reached, the program will attempt to open another file, if the last pass of the archive has not been processed yet. So the current Diskcomm archive file is closed, the character that we decided to use to increment is then incremented, and we will try to open the next part of the archive. All sorts of things can go wrong here, but these problems are easy to diagnose. First off, the next file must be present. If the character that we should be incrementing is not the last character in the filename, as described before, then this program will not be able to process this set of files.
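
The step of moving from one part of a multi file archive to the next can be sketched in C as follows. This is a simplified illustration, not the code from this program: it assumes the character to increment is simply the last one before the extension, it does not reproduce the scan for a "1" or an "A", and the function name next_part_name is made up for this example.

```c
#include <assert.h>
#include <string.h>

/*
 * Change a part filename in place so that it names the next part,
 * e.g. "demo1.dcm" becomes "demo2.dcm".
 * Assumption: the character to increment is the last one before the
 * extension (or the last character if there is no extension).
 * No wrap-around is handled: "9" would simply become ":".
 * Returns 1 on success, 0 if the base name is empty.
 */
static int next_part_name(char *name)
{
    char *end;
    char *sep;

    assert(name != NULL);
    end = strrchr(name, '.');    /* start of the extension, if any */
    sep = strrchr(name, '\\');   /* last DOS directory separator, if any */

    if (end != NULL && sep != NULL && end < sep)
        end = NULL;              /* the dot belongs to a directory name */
    if (end == NULL)
        end = name + strlen(name);
    if (end == name || (sep != NULL && end == sep + 1))
        return 0;                /* nothing in front of the extension */

    end[-1]++;                   /* "1" -> "2", "A" -> "B", ... */
    return 1;
}
```

With this scheme a set named demo1.dcm, demo2.dcm, demo3.dcm is walked in order, while a set such as demo1a.dcm, demo2a.dcm is not, which is the situation described above.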
Either merge the archive files, or rename the files such that there is a "1" or an "A" in the last position. You would have to rename all portions accordingly of course. If for some reason the sequence of the files is not correct, the program will tell you that the pass number is out of sequence. Check and correct the filenames. If you merged a multi file archive, you might have merged the parts in an incorrect order, possibly because you used wild cards. If you merged the archive files correctly, processing can continue without closing the current file and opening the next one of course.

I cannot think of any more problems. If you are reading this, it might mean you now have a problem that I have no solution for. Who else would read the documentation? You might try getting help from the newsgroup, or send me an E-mail.

Epilog.

Well, reading the manual wasn't that hard now, was it? I have stated that some of the other similar utilities that exist in their current version will not always convert a Diskcomm archive properly. I am not accusing anyone. I just want you all to know that if you have tried utilities like this before, the disk images that were produced might have been faulty. If this led you to believe that this sort of thing does not work, you might want to try it again. What I am more concerned about though is that if you used these utilities before, you might now have several disk image files that are corrupted. If you still have the Diskcomm files, you should probably convert them all over again, to make sure that the disk image contains the data that it is supposed to.

Well, that is about it. Thanks to Bob Puff for inventing Diskcomm. Thanks to Nick Kennedy for inventing SIO2PC. Thanks to Jason Duerstock and all others that thought of bridging the gap, like Bobterm says, in more than one way. This includes Steven Tucker for inventing Imagic and APE, thanks for updating your stuff too, for people that do not know how to handle a command line. Thanks to you for still caring.
This program is totally free. If you want to get rid of some money, the aforementioned people worked hard to make this all possible, so maybe it is time to pay that shareware fee.

Keep those XL's/XE's humming.

Ernest R. Schreurs
Kempenlandstraat 8
5211 VN Den Bosch
The Netherlands
E-mail: ernest@wxs.nl