COM file
==DOS binary format==
The COM format is the original binary executable format used in [[CP/M]] (including [[SCP (operating system)|SCP]] and [[MSX-DOS]]) as well as [[DOS]]. It is very simple; it has no header (with the exception of CP/M 3 files),<ref name="Elliott_1998"/> and contains no standard [[metadata]], only code and data. This simplicity exacts a price: the [[object code|binary]] has a maximum size of 65,280 (FF00[[hexadecimal|h]]) bytes (256 bytes short of 64 KiB) and stores all its [[Code segment|code]] and [[Data segment|data]] in one [[x86 memory segmentation|segment]].<!-- Actually, this is not true, most DOS issues support COM programs larger than 64 KiB for as long as they fit into the TPA. However, there are slight differences and a number of bugs in some versions of DOS which may cause problems when loading COM files larger than 64 KiB. --> Since it lacks [[Relocation (computing)|relocation]] information, it is [[loader (computing)|loaded]] by the operating system at a pre-set address, at offset 0100h immediately following the [[program segment prefix|PSP]], where it is executed (hence the limitation of the executable's size): the [[entry point]] is fixed at 0100h.<ref group="nb" name="NB_ORG"/> This was not an issue on 8-bit machines, since they can address only 64 KiB of memory, but 16-bit machines have a much larger address space, which is why the format fell out of use.

In the [[Intel 8080]] CPU architecture, only 65,536 bytes of memory could be addressed (address range 0000h to FFFFh). Under CP/M, the first 256 bytes of this memory, from 0000h to 00FFh, were reserved for system use by the [[zero page]], and any user program had to be loaded at exactly 0100h to be executed.<ref group="nb" name="NB_ORG"/> COM files fit this model perfectly. Before the introduction of [[MP/M]] and [[Concurrent CP/M]], there was no possibility of running more than one program or command at a time: the program loaded at 0100h was run, and no other.
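
The load model described above can be illustrated with a small simulation. The following Python sketch is not taken from any operating system's loader; the names <code>load_com_image</code>, <code>SEGMENT_SIZE</code>, <code>LOAD_OFFSET</code> and <code>MAX_COM_SIZE</code> are hypothetical, and the size check simply applies the 65,280-byte limit as stated in the text. It copies a raw image to offset 0100h of a simulated 64 KiB segment, leaving the first 256 bytes free for the PSP (DOS) or zero page (CP/M).

```python
# Illustrative simulation only; all names here are hypothetical.
SEGMENT_SIZE = 0x10000  # 64 KiB addressable within one segment
LOAD_OFFSET = 0x0100    # first 256 bytes: PSP (DOS) or zero page (CP/M)
MAX_COM_SIZE = SEGMENT_SIZE - LOAD_OFFSET  # FF00h = 65,280 bytes


def load_com_image(image: bytes) -> bytearray:
    """Return a simulated 64 KiB segment with the image copied to 0100h."""
    if len(image) > MAX_COM_SIZE:
        raise ValueError(
            f"image is {len(image)} bytes; limit is {MAX_COM_SIZE}"
        )
    segment = bytearray(SEGMENT_SIZE)  # zero-filled memory
    # No relocation is applied: the bytes are copied verbatim, and
    # execution would begin at the fixed entry point LOAD_OFFSET.
    segment[LOAD_OFFSET:LOAD_OFFSET + len(image)] = image
    return segment
```

Because the image is copied unchanged, any absolute address inside it must already assume a load offset of 0100h, which is why the format needs no header or relocation table.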

Although the file format is the same in DOS and CP/M, .COM files for the two operating systems are not compatible; DOS COM files contain [[x86]] instructions and possibly DOS [[system call]]s, while CP/M COM files contain [[Intel 8080|8080]] instructions and CP/M system calls (programs restricted to certain machines could also contain additional instructions for [[Intel 8085|8085]] or [[Zilog Z80|Z80]]).

When DOS loads a .COM file, it sets all x86 segment registers to the same value and the SP (stack pointer) register to the offset of the last word available in the first 64 KiB segment (typically FFFEh) or to the maximum size of memory available in the block the program is loaded into, which must hold both the program and at least 256 bytes of stack, whichever is smaller. The stack thus begins at the very top of the corresponding memory segment and works down from there.<ref name="Paul_2002_COM"/><ref name="FYS_2020"/>

In the original DOS 1.x [[DOS API|API]], which was a derivative of the CP/M API, a .COM program terminated by calling INT 20h (Terminate Program) or INT 21h Function 0, which served the same purpose; the programmer also had to ensure that the code and data segment registers contained the same value at program termination to avoid a potential system crash. Although these calls can be used in any DOS version, Microsoft recommended INT 21h Function 4Ch for program termination from DOS 2.x onward, which does not require the data and code segment registers to hold the same value.

It is possible to make a .COM file that runs under both operating systems, in the form of a [[fat binary]]. There is no true compatibility at the instruction level; the bytes at the [[entry point]] are chosen so that they are valid on both processors but are interpreted differently, making program execution jump to the section for the operating system in use.
Such a file is essentially two different programs with the same functionality in a single file, preceded by code selecting the one to use.

Under CP/M 3, if the first byte of a COM file is C9h, there is a 256-byte header;<ref name="Elliott_1998"/> since C9h corresponds to the [[Intel 8080|8080]] instruction <code>RET</code>, the COM file will terminate immediately if run on an earlier version of CP/M that does not support this extension. (Because the instruction sets of the 8085 and Z80 are supersets of the 8080 instruction set, this works on all three processors.) C9h is an [[invalid opcode]] on the 8088/8086, and it will cause a processor-generated interrupt 6 exception in [[v86 mode]] on the [[Intel 80386|386]] and later x86 chips. Because C9h has been the opcode for <code>LEAVE</code> since the [[Intel 80188|80188]]/[[Intel 80186|80186]] and is therefore not used as the first instruction in a valid program, the executable loader in some versions of DOS rejects COM files that start with C9h, avoiding a crash.

Files may have names ending in .COM without being in the simple format described above; this is indicated by a [[magic number (programming)|magic number]] at the start of the file. For example, the [[COMMAND.COM]] file in [[DR DOS 6.0]] is actually in [[DOS executable]] format, indicated by the first two bytes being ''MZ'' (4Dh 5Ah), the initials of [[Mark Zbikowski]].
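
The leading-byte checks described above can be summarised in a short Python sketch. The function name <code>classify_com_file</code> and its return strings are hypothetical and are used only for illustration; the logic covers just the two signatures mentioned in the text (''MZ'' and C9h) and treats everything else as a headerless image. It is a heuristic, since a raw image could in principle begin with the same bytes.

```python
# Illustrative heuristic only; the function name and labels are hypothetical.
def classify_com_file(data: bytes) -> str:
    """Guess the layout of a file named *.COM from its first bytes."""
    if data[:2] == b"MZ":  # 4Dh 5Ah: DOS executable signature
        return "DOS executable (MZ) stored under a .COM name"
    if data[:1] == b"\xC9":  # 8080 RET: marks a CP/M 3 header
        return "CP/M 3 COM file with 256-byte header"
    # No signature: raw code and data, loaded at offset 0100h.
    return "plain COM image"
```

For example, <code>classify_com_file(b"MZ\x90\x00")</code> reports a DOS executable, while a two-byte file containing only <code>CD 20</code> (INT 20h) is reported as a plain COM image.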