{{Short description|Source code that alters its instructions to the hardware while executing}}
{{More citations needed|date=April 2009}}
{{Use dmy dates|date=January 2022|cs1-dates=y}}
{{Use list-defined references|date=December 2021}}
{{Use American English|date=January 2019}}

In [[computer science]], '''self-modifying code''' ('''SMC''' or '''SMoC''') is [[source code|code]] that alters its own [[instruction (computer science)|instruction]]s while it is [[execution (computing)|executing]] – usually to reduce the [[instruction path length]] and improve [[computer performance|performance]] or simply to reduce otherwise [[duplicate code|repetitively similar code]], thus simplifying [[software maintenance|maintenance]]. The term is usually only applied to code where the self-modification is intentional, not in situations where code accidentally modifies itself due to an error such as a [[buffer overflow]].

Self-modifying code can involve overwriting existing instructions or generating new code at run time and transferring control to that code.

Self-modification can be used as an alternative to the method of "flag setting" and conditional program branching, used primarily to reduce the number of times a condition needs to be tested.

The method is frequently used for conditionally invoking [[test/debugging]] code without requiring additional [[computational overhead]] for every [[input/output]] cycle.

The modifications may be performed:
* '''only during initialization''' – based on input [[Parameter#Computing|parameter]]s (when the process is more commonly described as software '[[computer configuration|configuration]]' and is somewhat analogous, in hardware terms, to setting [[jumper (computing)|jumper]]s for [[printed circuit board]]s). Alteration of program entry [[pointer (computer programming)|pointer]]s is an equivalent indirect method of self-modification, but requiring the co-existence of one or more alternative instruction paths, increasing the [[binary file|program size]].
* '''throughout execution''' ("on the fly") – based on particular program states that have been reached during the execution

In either case, the modifications may be performed directly to the [[machine code]] instructions themselves, by [[overlapping instructions|overlaying]] new instructions over the existing ones (for example: altering a compare and branch to an [[unconditional branch]] or alternatively a '[[NOP (code)|NOP]]').

In the [[IBM System/360 architecture]], and its successors up to [[z/Architecture]], an EXECUTE (EX) instruction ''logically'' overlays the second byte of its target instruction with the low-order 8 bits of [[general-purpose register|register]] 1. This provides the effect of self-modification although the actual instruction in storage is not altered.

==Application in low and high level languages==
Self-modification can be accomplished in a variety of ways depending upon the programming language and its support for pointers and/or access to dynamic compiler or interpreter 'engines':
* '''overlay of existing instructions''' (or parts of instructions such as opcode, register, flags or addresses) or
* '''direct creation of whole instructions''' or sequences of instructions in memory
* '''creation or modification of [[source code]] statements''' followed by a 'mini compile' or a dynamic interpretation (see [[eval]] statement)
* '''creating an entire program dynamically''' and then executing it

===Assembly language===
Self-modifying code is quite straightforward to implement when using [[assembly language]].
Instructions can be dynamically created in [[computer memory|memory]] (or else overlaid over existing code in non-protected program storage),<ref name="HP9100A_1998"/> in a sequence equivalent to the ones that a standard compiler may generate as the [[object code]]. With modern processors, there can be unintended [[side effect (computer science)|side effect]]s on the [[CPU cache]] that must be considered. The method was frequently used for testing 'first time' conditions, as in this suitably commented [[IBM/360]] [[assembler (computer programming)|assembler]] example. It uses instruction overlay to reduce the [[instruction path length]] by (N×1)−1 where N is the number of records on the file (−1 being the [[computational overhead|overhead]] to perform the overlay).

 SUBRTN   NOP   OPENED             FIRST TIME HERE?
 * The NOP is x'4700'<Address_of_opened>
          OI    SUBRTN+1,X'F0'     YES, CHANGE NOP TO UNCONDITIONAL BRANCH (47F0...)
          OPEN  INPUT              AND OPEN THE INPUT FILE SINCE IT'S THE FIRST TIME THRU
 OPENED   GET   INPUT              NORMAL PROCESSING RESUMES HERE
          ...

Alternative code might involve testing a "flag" each time through. The unconditional branch is slightly faster than a compare instruction, as well as reducing the overall path length. In later operating systems for programs residing in [[memory protection|protected storage]] this technique could not be used and so changing the pointer to the [[subroutine]] would be used instead. The pointer would reside in [[dynamic storage]] and could be altered at will after the first pass to bypass the OPEN (having to load a pointer first instead of a direct branch & link to the subroutine would add N instructions to the path length – but there would be a corresponding reduction of N for the unconditional branch that would no longer be required).

Below is an example in [[Zilog Z80]] assembly language. The code increments register "B" in range [0,5]. The "CP" compare instruction is modified on each loop.
<syntaxhighlight lang="nasm">
;==========
ORG 0H
CALL FUNC00
HALT
;==========
FUNC00:
LD A,6              ; loop limit
LD HL,label01+1     ; HL points at the operand byte of the CP instruction
LD B,(HL)           ; B = current operand (initially 0)
label00:
INC B
LD (HL),B           ; overwrite the CP operand with the new value of B
label01:
CP $0               ; compares A with the patched operand
JP NZ,label00       ; repeat until the operand equals A
RET
;==========
</syntaxhighlight>

Self-modifying code is sometimes used to overcome limitations in a machine's instruction set. For example, in the [[Intel 8080]] instruction set, one cannot input a byte from an input port that is specified in a register. The input port is statically encoded in the instruction itself, as the second byte of a two-byte instruction. Using self-modifying code, it is possible to store a register's contents into the second byte of the instruction, then execute the modified instruction in order to achieve the desired effect.

===High-level languages===
Some compiled languages explicitly permit self-modifying code. For example, the ALTER verb in [[COBOL]] may be implemented as a branch instruction that is modified during execution.<ref name="MicroFocus_ALTER"/> Some [[batch file|batch]] programming techniques involve the use of self-modifying code. [[Clipper (programming language)|Clipper]] and [[SPITBOL]] also provide facilities for explicit self-modification. The Algol compiler on [[Burroughs Large Systems|B6700 system]]s offered an interface to the operating system whereby executing code could pass a text string or a named disc file to the Algol compiler and was then able to invoke the new version of a procedure.

With interpreted languages, the "machine code" is the source text and may be susceptible to editing on-the-fly: in [[SNOBOL]] the source statements being executed are elements of a text array. Other languages, such as [[Perl]] and [[Python (programming language)|Python]], allow programs to create new code at run-time and execute it using an [[eval]] function, but do not allow existing code to be mutated.
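
The run-time code creation described above can be sketched in Python as follows. This is a minimal illustration, not a recommended practice: the source text of a new function is assembled as a string and passed to the built-in <code>exec</code>. The names <code>make_adder_source</code>, <code>build_adder</code> and <code>add_n</code> are hypothetical and chosen only for this example. No existing code is altered; each call creates a new function object.

```python
def make_adder_source(n):
    """Return Python source text for a function that adds the constant n."""
    return f"def add_n(x):\n    return x + {n}\n"


def build_adder(n):
    """Compile the generated source and return the newly created function."""
    namespace = {}
    exec(make_adder_source(n), namespace)  # new code is created at run time
    return namespace["add_n"]


add_two = build_adder(2)
add_ten = build_adder(10)
print(add_two(1), add_ten(1))
```

Each generated function has its constant written directly into its source text, so two calls to <code>build_adder</code> yield two independent functions.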
The illusion of modification (even though no machine code is really being overwritten) is achieved by modifying function pointers, as in this JavaScript example:

<syntaxhighlight lang="JavaScript">
    var f = function (x) {return x + 1};

    // assign a new definition to f:
    f = new Function('x', 'return x + 2');
</syntaxhighlight>

[[Lisp macro]]s also allow runtime code generation without parsing a string containing program code.

The Push programming language is a [[genetic programming]] system that is explicitly designed for creating self-modifying programs. While not a high level language, it is not as low level as assembly language.<ref name="Push"/>

====Compound modification====
Prior to the advent of multiple windows, command-line systems might offer a menu system involving the modification of a running command script. Suppose a [[DOS]] script (or "batch") file MENU.BAT contains the following:<ref name="Fosdal_2001"/><ref group="nb" name="NB_CHOICE"/>

 :start
 SHOWMENU.EXE

Upon initiation of MENU.BAT from the command line, SHOWMENU presents an on-screen menu, with possible help information, example usages and so forth. Eventually the user makes a selection that requires a command ''SOMENAME'' to be performed: SHOWMENU exits after rewriting the file MENU.BAT to contain

 :start
 SHOWMENU.EXE
 CALL ''SOMENAME''.BAT
 GOTO start

Because the DOS command interpreter does not compile a script file and then execute it, nor does it read the entire file into memory before starting execution, nor yet rely on the content of a record buffer, when SHOWMENU exits, the command interpreter finds a new command to execute (it is to invoke the script file ''SOMENAME'', in a directory location and via a protocol known to SHOWMENU), and after that command completes, it goes back to the start of the script file and reactivates SHOWMENU ready for the next selection. Should the menu choice be to quit, the file would be rewritten back to its original state.
Although this starting state has no use for the label, the label or an equivalent amount of text is required, because the DOS command interpreter recalls the byte position of the next command when it is about to start it; the rewritten file must therefore maintain alignment so that the recalled position is indeed the start of the next command. Aside from the convenience of a menu system (and possible auxiliary features), this scheme means that the SHOWMENU.EXE system is not in memory when the selected command is activated, a significant advantage when memory is limited.<ref name="Fosdal_2001"/><ref name="Paul_1996"/>

===Control tables===
[[Control table]] [[interpreter (computing)|interpreter]]s can be considered to be, in one sense, 'self-modified' by data values extracted from the table entries (rather than specifically [[hand coding|hand coded]] in [[conditional (computer programming)|conditional statement]]s of the form "IF inputx = 'yyy'").

===Channel programs===
Some IBM [[access method]]s traditionally used self-modifying [[Channel I/O#Channel program|channel program]]s, where a value, such as a disk address, is read into an area referenced by a channel program, where it is used by a later channel command to access the disk.

==History==
The [[IBM SSEC]], demonstrated in January 1948, had the ability to modify its instructions or otherwise treat them exactly like data. However, the capability was rarely used in practice.<ref name="Bashe-Buchholz-Hawkins-Ingram-Rochester_1981"/> In the early days of computers, self-modifying code was often used to reduce use of limited memory, or improve performance, or both.
It was also sometimes used to implement subroutine calls and returns when the instruction set only provided simple branching or skipping instructions to vary the [[control flow]].<ref name="Miller_2006"/><ref name="Wenzl-Merzdovnik-Ullrich-Weippl_2019"/> This use is still relevant in certain ultra-[[Reduced instruction set computer|RISC]] architectures, at least theoretically; see for example [[one-instruction set computer]]. [[Donald Knuth]]'s [[MIX (abstract machine)|MIX]] architecture also used self-modifying code to implement subroutine calls.<ref name="Knuth_MMIX"/>

==Usage==
Self-modifying code can be used for various purposes:
* Semi-automatic [[Program optimization|optimizing]] of a state-dependent loop.
* Dynamic in-place code optimization for speed depending on load environment.<ref name="Caldera_1997_DOSSRC"/><ref name="Paul_1997_OD-A3"/><ref group="nb" name="NB_DR-DOS_386"/>
* [[Run time (program lifecycle phase)|Run-time]] code generation, or specialization of an algorithm in runtime or loadtime (which is popular, for example, in the domain of real-time graphics) such as a general sort utility – preparing code to perform the key comparison described in a specific invocation.
* Altering of [[inline function|inlined]] state of an [[object (computer science)|object]], or simulating the high-level construction of [[closure (computer programming)|closure]]s.
* Patching of [[subroutine]] ([[pointer (computer programming)|pointer]]) address calling, usually as performed at load/initialization time of [[dynamic libraries]], or else on each invocation, patching the subroutine's internal references to its parameters so as to use their actual addresses (i.e. indirect self-modification).
* Evolutionary computing systems such as [[neuroevolution]], [[genetic programming]] and other [[evolutionary algorithm]]s.
* Hiding of code to prevent [[reverse engineering]] (by use of a [[disassembler]] or [[debugger]]) or to evade detection by virus/spyware scanning software and the like.
* Filling 100% of memory (in some architectures) with a rolling pattern of repeating [[opcode]]s, to erase all programs and data, or to [[burn-in]] hardware or perform [[RAM test]]s.<ref name="Wilkinson_1996"/>
* [[Executable compression|Compressing]] code to be decompressed and executed at runtime, e.g., when memory or disk space is limited.<ref name="Caldera_1997_DOSSRC"/><ref name="Paul_1997_OD-A3"/>
* Some very limited [[instruction set architecture|instruction set]]s leave no option but to use self-modifying code to perform certain functions. For example, a [[one-instruction set computer]] (OISC) machine that uses only the subtract-and-branch-if-negative "instruction" cannot do an indirect copy (something like the equivalent of "*a = **b" in the [[C (programming language)|C language]]) without using self-modifying code.
* [[Booting]]. Early [[microcomputer]]s often used self-modifying code in their bootloaders. Since the bootloader was keyed in via the front panel at every power-on, it did not matter if the [[bootloader]] modified itself.
However, even today many bootstrap loaders are [[self-relocating]], and a few are even self-modifying.<ref group="nb" name="NB_DR-DOS_707"/>
* Altering instructions for fault-tolerance.<ref name="Ortiz_2015"/>

===Optimizing a state-dependent loop===
[[Pseudocode]] example:

 repeat ''N'' times {
     if STATE is 1
         increase A by one
     else
         decrease A by one
     do something with A
 }

Self-modifying code, in this case, would simply be a matter of rewriting the loop like this:

 repeat ''N'' times {
     ''increase'' A by one
     do something with A
     when STATE has to switch {
         replace the opcode "increase" above with the opcode to decrease, or vice versa
     }
 }

Note that two-state replacement of the [[opcode]] can be easily written as 'xor var at address with the value "opcodeOf(Inc) xor opcodeOf(dec)"'.

Choosing this solution must depend on the value of {{Var|N}} and the frequency of state changing.

===Specialization===
Suppose a set of statistics such as average, extrema, location of extrema, standard deviation, etc. are to be calculated for some large data set. In a general situation, there may be an option of associating weights with the data, so each x<sub>i</sub> is associated with a w<sub>i</sub>; rather than test for the presence of weights at every index value, there could be two versions of the calculation, one for use with weights and one without, with one test at the start. Now consider a further option, that each value may have associated with it a Boolean to signify whether that value is to be skipped or not. This could be handled by producing four batches of code, one for each combination of options, and code bloat results. Alternatively, the weight and the skip arrays could be merged into a temporary array (with zero weights for values to be skipped), at the cost of extra processing, and there is still bloat. However, with code modification, the code for skipping unwanted values and for applying weights could be added, as appropriate, to the template for calculating the statistics.
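
The template approach described above can be sketched in Python using run-time code generation rather than patching of machine code. The sketch is an illustration under stated assumptions: it computes only the mean rather than the full set of statistics, and the names <code>make_mean</code>, <code>use_weights</code> and <code>use_skip</code> are hypothetical. The option tests are made once, while the function is generated, and the generated loop contains only the statements the chosen options require.

```python
def make_mean(use_weights, use_skip):
    """Generate a mean function containing only the code the options need."""
    lines = [
        "def mean(x, w=None, skip=None):",
        "    total = 0.0",
        "    count = 0.0",
        "    for i in range(len(x)):",
    ]
    if use_skip:
        # Added to the template only when a skip array is in use.
        lines.append("        if skip[i]:")
        lines.append("            continue")
    if use_weights:
        lines.append("        total += w[i] * x[i]")
        lines.append("        count += w[i]")
    else:
        lines.append("        total += x[i]")
        lines.append("        count += 1")
    lines.append("    return total / count")
    namespace = {}
    exec("\n".join(lines), namespace)  # compile the specialized version
    return namespace["mean"]


plain = make_mean(use_weights=False, use_skip=False)
weighted_skipping = make_mean(use_weights=True, use_skip=True)
print(plain([1, 2, 3]))
print(weighted_skipping([1, 2, 100], [1, 3, 5], [False, False, True]))
```

Only one template is written, yet each of the four option combinations gets a loop body free of per-element option tests. The sketch does not guard against an empty data set or a zero total weight.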
There would be no repeated testing of the options and the data array would be accessed once, as also would the weight and skip arrays, if involved. ===Use as camouflage=== Self-modifying code is more complex to analyze than standard code and can therefore be used as a protection against [[reverse engineering]] and [[software cracking]]. Self-modifying code was used to hide copy protection instructions in 1980s disk-based programs for systems such as [[IBM PC compatible]]s and [[Apple II]]. For example, on an IBM PC, the [[floppy disk]] drive access instruction <code>[[int 0x13]]</code> would not appear in the executable program's image but it would be written into the executable's memory image after the program started executing. Self-modifying code is also sometimes used by programs that do not want to reveal their presence, such as [[computer virus]]es and some [[shellcode]]s. Viruses and shellcodes that use self-modifying code mostly do this in combination with [[polymorphic code]]. Modifying a piece of running code is also used in certain attacks, such as [[buffer overflow]]s. ===Self-referential machine learning systems=== Traditional [[machine learning]] systems have a fixed, pre-programmed learning [[algorithm]] to adjust their [[parameter (computer programming)|parameter]]s. However, since the 1980s [[Jürgen Schmidhuber]] has published several self-modifying systems with the ability to change their own learning algorithm. They avoid the danger of catastrophic self-rewrites by making sure that self-modifications will survive only if they are useful according to a user-given [[fitness function|fitness]], [[error function|error]] or [[reward function|reward]] function.<ref name="Schmidhuber"/> ===Operating systems=== The [[Linux kernel]] notably makes wide use of self-modifying code; it does so to be able to distribute a single binary image for each major architecture (e.g. [[IA-32]], [[x86-64]], 32-bit [[ARM architecture family|ARM]], [[ARM64]]...) 
while adapting the kernel code in memory during boot depending on the specific CPU model detected, e.g. to be able to take advantage of new CPU instructions or to work around hardware bugs.<ref name="linux_self_modifying_Paltsev">{{cite web |author-last=Paltsev |author-first=Evgeniy |title=Self Modifying Code in Linux Kernel - What, Where and How |date=2020-01-30 |url=https://talk.telematika.org/2019/all/self_modifying_code_in_linux_kernel_-_what_where_and_how/ |access-date=2022-11-27}}</ref><ref name="linux_self_modifying_altinstructions">{{cite web |author-last=Wieczorkiewicz |author-first=Pawel |title=Linux Kernel Alternatives |url=https://grsecurity.net/linux_kernel_alternatives |access-date=2022-11-27}}</ref> To a lesser extent, the [[DR-DOS]] kernel also optimizes speed-critical sections of itself at loadtime depending on the underlying processor generation.<ref name="Caldera_1997_DOSSRC"/><ref name="Paul_1997_OD-A3"/><ref group="nb" name="NB_DR-DOS_386"/> Regardless, at a [[meta-level]], programs can still modify their own behavior by changing data stored elsewhere (see [[metaprogramming]]) or via use of [[type polymorphism|polymorphism]]. ==={{anchor|Synthesis}}Massalin's Synthesis kernel=== The Synthesis [[kernel (operating system)|kernel]] presented in [[Alexia Massalin]]'s [[Doctor of Philosophy|Ph.D.]] thesis<ref name="Massalin_1992_Synthesis"/><ref name="Henson_2008"/> is a tiny [[Unix]] kernel that takes a [[structured programming|structured]], or even [[object-oriented programming|object oriented]], approach to self-modifying code, where code is created for individual [[quaject]]s, like filehandles. Generating code for specific tasks allows the Synthesis kernel to (as a JIT interpreter might) apply a number of [[Compiler optimization|optimization]]s such as [[constant folding]] or [[common subexpression elimination]].<!-- Anyone want to go read the thesis and see what other optimizations Massalin lists? 
--> The Synthesis kernel was very fast, but was written entirely in assembly. The resulting lack of portability has prevented Massalin's optimization ideas from being adopted by any production kernel. However, the structure of the techniques suggests that they could be captured by a higher level [[programming language|language]], albeit one more complex than existing mid-level languages. Such a language and compiler could allow development of faster operating systems and applications. [[Paul Haeberli]] and Bruce Karsh have objected to the "marginalization" of self-modifying code, and optimization in general, in favor of reduced development costs.<ref name="Haeberli_1994_GraficaObscura"/>

==Interaction of cache and self-modifying code==
On architectures without coupled data and instruction cache (for example, some [[SPARC]], ARM, and [[MIPS architecture|MIPS]] cores) the cache synchronization must be explicitly performed by the modifying code (flush data cache and invalidate instruction cache for the modified memory area). In some cases short sections of self-modifying code execute more slowly on modern processors. This is because a modern processor will usually try to keep blocks of code in its cache memory. Each time the program rewrites a part of itself, the rewritten part must be loaded into the cache again. This results in a slight delay if the modified [[codelet]] shares a cache line with the modifying code, as is the case when the modified memory address lies within a few bytes of the modifying code.
The cache invalidation issue on modern processors usually means that self-modifying code would still be faster only when the modification will occur rarely, such as in the case of a state switching inside an inner loop.{{Citation needed|date=March 2008}} Most modern processors load the machine code before they execute it, which means that if an instruction that is too near the [[instruction pointer]] is modified, the processor will not notice, but instead execute the code as it was ''before'' it was modified. See [[prefetch input queue]] (PIQ). PC processors must handle self-modifying code correctly for backwards compatibility reasons but they are far from efficient at doing so.{{Citation needed|date=March 2008}} ==Security issues== Because of the security implications of self-modifying code, all of the major [[operating system]]s are careful to remove such vulnerabilities as they become known. The concern is typically not that programs will intentionally modify themselves, but that they could be maliciously changed by an [[exploit (computer security)|exploit]]. One mechanism for preventing malicious code modification is an operating system feature called [[W^X]] (for "write [[xor]] execute"). This mechanism prohibits a program from making any page of memory both writable and executable. Some systems prevent a writable page from ever being changed to be executable, even if write permission is removed.{{citation needed|date=May 2022}} Other systems provide a '[[backdoor (computing)|back door]]' of sorts, allowing multiple mappings of a page of memory to have different permissions. A relatively portable way to bypass W^X is to create a file with all permissions, then map the file into memory twice. 
On Linux, one may use an undocumented SysV shared memory flag to get executable shared memory without needing to create a file.{{Citation needed|date=August 2018}}

==Advantages==
* [[Fast path]]s can be established for a program's execution, reducing some otherwise repetitive [[conditional branch]]es.
* Self-modifying code can improve [[algorithmic efficiency]].

==Disadvantages==
Self-modifying code is harder to read and maintain because the instructions in the source program listing are not necessarily the instructions that will be executed. Self-modification that consists of substitution of [[function pointer]]s might not be as cryptic, if it is clear that the names of functions to be called are placeholders for functions to be identified later.

Self-modifying code can be rewritten as code that tests a [[flag (programming)|flag]] and branches to alternative sequences based on the outcome of the test, but self-modifying code typically runs faster.

Self-modifying code conflicts with authentication of the code and may require exceptions to policies requiring that all code running on a system be signed. Modified code must be stored separately from its original form, conflicting with memory management solutions that normally discard the code in RAM and reload it from the executable file as needed.

On modern processors with an [[instruction pipelining|instruction pipeline]], code that modifies itself frequently may run more slowly, if it modifies instructions that the processor has already read from memory into the pipeline. On some such processors, the only way to ensure that the modified instructions are executed correctly is to flush the pipeline and reread many instructions.
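
The two alternatives contrasted in this section, testing a flag on every call and substituting a function pointer, can be sketched in Python. The classes <code>FlagReader</code> and <code>RebindingReader</code> are hypothetical, and the "open" step is simulated by a counter. The sketch shows the structure of each approach only and makes no claim about their relative speed.

```python
class FlagReader:
    """Tests a flag on every call to decide whether the open step is needed."""

    def __init__(self):
        self.open_calls = 0
        self.opened = False

    def read(self, record):
        if not self.opened:      # evaluated on every single call
            self.open_calls += 1
            self.opened = True
        return record.upper()


class RebindingReader:
    """Replaces its own 'read' entry point after the first call."""

    def __init__(self):
        self.open_calls = 0
        self.read = self._first_read   # placeholder, replaced after first use

    def _first_read(self, record):
        self.open_calls += 1
        self.read = self._normal_read  # later calls bypass the open step
        return self._normal_read(record)

    def _normal_read(self, record):
        return record.upper()


flag_reader = FlagReader()
rebinding_reader = RebindingReader()
records = ["a", "b", "c"]
print([flag_reader.read(r) for r in records], flag_reader.open_calls)
print([rebinding_reader.read(r) for r in records], rebinding_reader.open_calls)
```

In the second class the name <code>read</code> acts as the placeholder mentioned above: a reader must know that it may refer to either of two methods, which is the readability cost being weighed against the removed test.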
Self-modifying code cannot be used at all in some environments, such as the following:
* Application software running under an operating system with strict W^X security cannot execute instructions in pages it is allowed to write to; only the operating system is allowed to both write instructions to memory and later execute those instructions.
* Many [[Harvard architecture]] [[microcontroller]]s cannot execute instructions in read-write memory, but only instructions in memory that they cannot write to, such as ROM or non-self-programmable [[flash memory]].
* A multithreaded application may have several threads executing the same section of self-modifying code, possibly resulting in computation errors and application failures.

==See also==
* [[Overlapping code]]
* [[Polymorphic code]]
* [[Polymorphic engine]]
* [[Persistent data structure]]
* [[AARD code]]
* [[Algorithmic efficiency]]
* [[Data as code]]
* [[eval]] statement
* [[IBM 1130#Code modification|IBM 1130]] (Example)
* [[Just-in-time compilation]]: This technique can often give users many of the benefits of self-modifying code (except memory size) without the disadvantages.
* [[Dynamic dead code elimination]]
* [[Homoiconicity]]
* [[PCASTL]]
* [[Quine (computing)]]
* [[Self-replication]]
* [[Reflective programming]]
* [[Monkey patch]]: a modification to runtime code that does not affect a program's original source code
* [[Extensible programming]]: a programming paradigm in which a programming language can modify its own syntax
* [[Self-modifying computer virus]]
* [[Self-hosting (compilers)|Self-hosting]]
* [[Synthetic programming]]
* [[Compiler bootstrapping]]
* [[Patchable microcode]]

==Notes==
{{reflist|group="nb"|refs=
{{r|group="nb"|name="NB_DR-DOS_386"|r=For example, when running on [[i386|386]] or higher processors, later [[Novell DOS 7]] updates<!-- not the early updates, and not OpenDOS 7.01 --> as well as [[DR-DOS 7.02]]<!-- actually since Matthias R.
Paul's Alpha 1 --> and higher will dynamically replace some default sequences of 16-bit <code>REP MOVSW</code> ("copy words") instructions in the kernel's runtime image by 32-bit <code>REP MOVSD</code> ("copy double-words") instructions when copying data from one memory location to another (and half the count of necessary repetitions) in order to speed up disk data transfers. [[Edge case]]s such as odd counts are taken care of.<ref name="Caldera_1997_DOSSRC"/><ref name="Paul_1997_OD-A3"/>}} {{r|group="nb"|name="NB_DR-DOS_707"|r=As an example, the [[DR-DOS]] [[master boot record|MBR]]s and [[volume boot record|boot sector]]s (which also hold the [[partition table]] and [[BIOS Parameter Block]], leaving less than 446<!-- MBR --> respectively 423<!-- 512-87-2 (ignoring the 3-byte-jump which can be counted as code) in the case of FAT32, a bit more with FAT12/FAT16 --> bytes for the code) were traditionally able to locate the boot file in the [[FAT12]] or [[FAT16]] file system by themselves and load it into memory as a whole, in contrast to their [[MS-DOS]]/[[PC DOS]] counterparts, which instead relied on the system files to occupy the first two directory entries in the file system and the first three sectors of [[IBMBIO.COM]] to be stored at the start of the data area in contiguous sectors containing a secondary loader to load the remainder of the file into memory (requiring [[SYS (DOS command)|SYS]] to take care of all these conditions). When [[FAT32]] and [[logical block addressing|LBA]] support was added, [[Microsoft]] even switched to require [[Intel 80386|386]] instructions and split the boot code over two sectors for size reasons, which was not an option for DR-DOS as it would have broken [[backward compatibility|backward]]- and cross-compatibility with other operating systems in [[multi-boot]] and [[chain load]] scenarios, as well as with older [[IBM PC compatible|PC]]s. 
Instead, the [[DR-DOS 7.07]] boot sectors resorted to self-modifying code, [[opcode]]-level programming in [[machine language]], controlled utilization of (documented) [[side effect (computer science)|side effect]]s, multi-level data/code [[instruction overlapping|overlapping]] and algorithmic [[fold (function)|fold]]ing techniques to still fit everything into a physical sector of only 512 bytes without giving up any of their extended functionality<!-- and adding some more -->.}} {{r|group="nb"|name="NB_CHOICE"|r=Later versions of DOS (since version 6.0) introduced the external [[CHOICE (DOS command)|CHOICE]] command (in [[DR-DOS]]<!-- 6.0 and higher --> also the internal command and [[CONFIG.SYS]] directive [[SWITCH (CONFIG.SYS directive)|SWITCH]]), so, for this specific example application of a menu system, it was no longer necessary to refer to self-modifying batchjobs, however for other applications it continued to be a viable solution.}} }} ==References== {{Reflist|refs= <ref name="Massalin_1992_Synthesis">{{Cite thesis |author-first1=Calton |author-last1=Pu |author-link1=Calton Pu |author-first2=Henry |author-last2=Massalin |author-link2=Henry Massalin |author-first3=John |author-last3=Ioannidis |degree=PhD |title=Synthesis: An Efficient Implementation of Fundamental Operating System Services |publisher=Department of Computer Sciences, [[Columbia University]] |location=New York, USA |id=UMI Order No. 
GAX92-32050 |date=1992 |url=https://www.scs.stanford.edu/nyu/04fa/sched/readings/synthesis.pdf |access-date=2023-04-25}} [https://www.cs.columbia.edu/~library/TR-repository/reports/reports-1992/cucs-039-92.ps.gz]</ref> <ref name="Henson_2008">{{cite news |title=KHB: Synthesis: An Efficient Implementation of Fundamental Operating Systems Services |author-first=Valerie |author-last=Henson |author-link=Valerie Henson |date=2008-02-20 |work=LWN.net |url=https://lwn.net/Articles/270081/ |access-date=2022-05-19 |url-status=live |archive-url=https://web.archive.org/web/20210817175159/https://lwn.net/Articles/270081/ |archive-date=2021-08-17}}</ref> <ref name="Haeberli_1994_GraficaObscura">{{cite web |author-first1=Paul |author-last1=Haeberli |author-link1=Paul Haeberli |author-first2=Bruce |author-last2=Karsh |title=Io Noi Boccioni - Background on Futurist Programming |work=Grafica Obscura |date=1994-02-03 |url=https://www.graficaobscura.com/future/index.html |access-date=2023-04-25}}</ref> <ref name="Bashe-Buchholz-Hawkins-Ingram-Rochester_1981">{{cite journal |title=The Architecture of IBM's Early Computers |author-first1=Charles J. |author-last1=Bashe |author-first2=Werner |author-last2=Buchholz |author-link2=Werner Buchholz |author-first3=George V. |author-last3=Hawkins |author-first4=J. James |author-last4=Ingram |author-first5=Nathaniel |author-last5=Rochester |journal=[[IBM Journal of Research and Development]] |issn=0018-8646 |date=September 1981 |volume=25 |issue=5 |pages=363–376 |doi=10.1147/rd.255.0363 |citeseerx=10.1.1.93.8952 |url=https://www.ece.ucdavis.edu/~vojin/CLASSES/EEC272/S2005/Papers/IBM-Architecture-Bashe_sep81.pdf |access-date=2023-04-25 |quote-page=365 |quote=The SSEC was the first operating computer capable of treating its own stored instructions exactly like data, modifying them, and acting on the result.}}</ref> <ref name="Miller_2006">{{cite web |title=Binary Code Patching: An Ancient Art Refined for the 21st Century. |author-first=Barton P. 
|author-last=Miller |date=2006-10-30 |publisher=[[NC State University]], Computer Science Department |series=Triangle Computer Science Distinguished Lecturer Series - Seminars 2006–2007 |url=https://arcb.csc.ncsu.edu/~mueller/seminar/fall06/miller.html |access-date=2023-04-25}}</ref> <ref name="Wenzl-Merzdovnik-Ullrich-Weippl_2019">{{cite journal |title=From hack to elaborate technique - A survey on binary rewriting |author-first1=Matthias |author-last1=Wenzl |author-first2=Georg |author-last2=Merzdovnik |author-first3=Johanna |author-last3=Ullrich |author-first4=Edgar R. |author-last4=Weippl |location=Vienna, Austria |journal=[[ACM Computing Surveys]] |volume=52 |number=3 |id=Article 49 |date=June 2019 |orig-date=<!-- accepted -->February 2019, <!-- revised -->November 2018, May 2018 |doi=10.1145/3316415 |pages=49:1–49:36 [49:1] |s2cid=195357367 |url=https://publications.sba-research.org/publications/201906%20-%20GMerzdovnik%20-%20From%20hack%20to%20elaborate%20technique.pdf |access-date=2021-11-28 |url-status=live |archive-url=https://web.archive.org/web/20210115224807/https://publications.sba-research.org/publications/201906%20-%20GMerzdovnik%20-%20From%20hack%20to%20elaborate%20technique.pdf |archive-date=2021-01-15 |quote-page=49:1 |quote=[…] Originally, [[binary rewriting]] was motivated by the need to change parts of a program during execution (e.g., run-time patching on the [[PDP-1]] in the 1960's) […]}} (36 pages)</ref> <ref name="Knuth_MMIX">{{cite web |title=MMIX 2009 - a RISC computer for the third millennium |author-first=Donald Ervin |author-last=Knuth |author-link=Donald Ervin Knuth |date=2009 |orig-date=1997 |url=https://www-cs-faculty.stanford.edu/~knuth/mmix.html |access-date=2021-11-28 |url-status=live |archive-url=https://web.archive.org/web/20211127194354/https://www-cs-faculty.stanford.edu/~knuth/mmix.html |archive-date=2021-11-27}}</ref> <ref name="Ortiz_2015">{{cite web |title=On Self-Modifying Code and the Space Shuttle OS 
|author-first=Carlos Enrique |author-last=Ortiz |date=2015-08-29 |orig-date=2007-08-18 |url=https://weblog.cenriqueortiz.com/computing/2007/08/18/on-self-modifying-code-and-the-space-shuttle-os/ |access-date=2023-04-25}}</ref> <ref name="MicroFocus_ALTER">{{cite book |chapter=The ALTER Statement |publisher=[[Micro Focus]] |title=COBOL Language Reference |url=https://www.microfocus.com/documentation/visual-cobol/vc80/VS2022/HRLHLHPDF803.html}}</ref> <ref name="Push">{{cite web |title=Evolutionary Computing with Push: Push, PushGP, and Pushpop |author-first=Lee |author-last=Spector |date= |publisher= |url=https://faculty.hampshire.edu/lspector/push.html |access-date=2023-04-25}}</ref> <ref name="Schmidhuber">[[Jürgen Schmidhuber]]'s publications on [https://people.idsia.ch/~juergen/metalearner.html self-modifying code for self-referential machine learning systems]</ref> <ref name="Fosdal_2001">{{cite web |title=Self-modifying Batch File |author-first=Lars |author-last=Fosdal |date=2001 |url=http://www.csd.net/~cgadd/knowbase/DOS0019.HTM |url-status=dead |archive-url=https://web.archive.org/web/20080421173331/http://www.csd.net/~cgadd/knowbase/DOS0019.HTM |archive-date=2008-04-21}}</ref> <ref name="Paul_1996">{{cite book |title=Konzepte zur Unterstützung administrativer Aufgaben in PC-Netzen und deren Realisierung für eine konkrete Novell-LAN-Umgebung unter Benutzung der Batchsprache von DOS |language=de |author-first=Matthias R. |author-last=Paul |version=3.11 |date=1996-10-13 |orig-date=1996-08-21<!-- v3.05-->, 1994 |location=Aachen, Germany |publisher=Lehrstuhl für Kommunikationsnetze ([[ComNets]]) & [[Institut für Kunststoffverarbeitung]] (IKV), RWTH |pages=51, 71–72}} (110+3 pages, diskette) (NB. 
Design and implementation of a centrally controlled modular distributed management system for automatic [[client (computing)|client]] configuration and [[software deployment]] with [[self-management (computer science)|self-healing]] update mechanism in [[local area network|LAN]] environments based on [[self-replication|self-replicating]] and indirectly self-modifying batchjobs with zero memory footprint instead of a need for [[terminate and stay resident|resident]] management software on the clients.)</ref> <ref name="Wilkinson_1996">{{cite web |title=The H89 Worm: Memory Testing the H89 |author-first=William "Bill" Albert |author-last=Wilkinson |date=2003 |orig-date=1996, 1984 |work=Bill Wilkinson's Heath Company Page |url=https://www.heco.wxwilki.com/h89worm.html |access-date=2021-12-13 |url-status=live |archive-url=https://web.archive.org/web/20211213130013/https://www.heco.wxwilki.com/h89worm.html |archive-date=2021-12-13 |quote=[…] Besides fetching an instruction, the [[Z80]] uses half of the cycle to [[RAM refresh|refresh]] the [[dynamic RAM]]. […] since the Z80 must spend half of each [[instruction fetch]] cycle performing other chores, it doesn't have as much time to fetch an [[instruction byte]] as it does a data byte. If one of the [[RAM chip]]s at the memory location being accessed is a little slow, the Z80 may get the wrong bit pattern when it fetches an instruction, but get the right one when it reads data. […] the built-in memory test won't catch this type of problem […] it's strictly a data read/write test. During the test, all instruction fetches are from the [[ROM]], not from RAM […] result[ing] in the [[Heath H89|H89]] passing the memory test but still operating erratically on some programs. […] This is a program that tests memory by relocating itself through RAM. As it does so, the CPU prints the current address of the program on the [[cathode-ray tube|CRT]] and then fetches the instruction at that address. 
If the RAM ICs are okay at that address, the CPU relocates the test program to the next memory location, prints the new address, and repeats the procedure. But, if one of the RAM ICs is slow enough to return an incorrect bit pattern, the CPU will misinterpret the instruction and behave unpredictably. However, it's likely that the display will lock up showing the address of faulty IC. This narrows the problem down eight ICs, which is an improvement over having to check as much as 32. […] The […] program will perform a worm test by pushing an RST 7 (RESTART 7) instruction from the low end of memory on up to the last working address. The rest of the program remains stationary and handles the display of the current location of the RST 7 command and its [[relocation (computing)|relocation]]. Incidentally, the program is called a [[computer worm|worm]] test because, as the RST 7 instruction moves up through memory, it leaves behind a [[NOP trail|slime trail]] of [[NOP (code)|NOP]]s (NO OPERATION). […]}}</ref> <ref name="Caldera_1997_DOSSRC">{{cite web |title=Caldera OpenDOS Machine Readable Source Kit (M.R.S) 7.01 |publisher=[[Caldera (company)|Caldera, Inc.]] |date=1997-05-01 |url=https://archive.sundby.com/retro/DR-DOS/dossrc.zip |access-date=2022-01-02 |url-status=dead |archive-url=https://web.archive.org/web/20210807095409/https://archive.sundby.com/retro/DR-DOS/dossrc.zip |archive-date=2021-08-07}} [https://web.archive.org/web/20220102102656/https://archive.sundby.com/retro/OpenDOS/OPENDOS_7.01_CODE.ZIP]</ref> <ref name="Paul_1997_OD-A3">{{cite web |author-first=Matthias R. 
|author-last=Paul |title=Caldera OpenDOS 7.01/7.02 Update Alpha 3 IBMBIO.COM README.TXT |url=http://www.uni-bonn.de/~uzs180/download/ibmbioa3.zip |date=1997-10-02 |access-date=2009-03-29 |url-status=dead |archive-url=https://web.archive.org/web/20031004074600/http://www-student.informatik.uni-bonn.de/~frinke/ibmbioa3.zip |archive-date=2003-10-04}} [https://web.archive.org/web/20181225154705/http://mirror.macintosharchive.org/max1zzz.co.uk/+Windows%20&%20DOS/DOS/System/Novell/Support/Bins/Op702src.zip<!-- Op702src.zip is an unofficial renamed distribution of the ibmbioa3.zip file -->]</ref>
<ref name="HP9100A_1998">{{cite web |title=HP 9100A/B |date=1998 |work=MoHPC - The Museum of HP Calculators |at=Overlapped Data and Program Memory / Self-Modifying Code |url=https://www.hpmuseum.org/hp9100.htm |access-date=2023-09-23 |url-status=live |archive-url=https://web.archive.org/web/20230923125424/https://www.hpmuseum.org/hp9100.htm |archive-date=2023-09-23}}</ref>
}}

==Further reading==
* {{cite web |title=GCR decoding on the fly |author-first=Linus |author-last=Åkesson |date=2013-03-31 |url=https://www.linusakesson.net/programming/gcr-decoding/index.php |access-date=2017-03-21 |url-status=live |archive-url=https://web.archive.org/web/20170321014657/https://www.linusakesson.net/programming/gcr-decoding/index.php |archive-date=2017-03-21}}
* {{cite book |title=Eine Bibliothek für Selbstmodifikationen zur Laufzeit in Java |language=de |trans-title=A library for self-modifications at runtime in Java |author-first=Christian Felix |author-last=Bürckert |date=2012-03-20 |type=Thesis |publisher=[[Universität des Saarlandes]], Naturwissenschaftlich-Technische Fakultät I, Fachrichtung Informatik |url=https://christian.buerckert.eu/wp-content/uploads/2014/03/Bachelorarbeit.pdf |access-date=2023-08-18 |url-status=live |archive-url=https://web.archive.org/web/20230818210630/https://christian.buerckert.eu/wp-content/uploads/2014/03/Bachelorarbeit.pdf |archive-date=2023-08-18}} (80
pages)

==External links==
* [https://asm.sourceforge.net/articles/smc.html Using self-modifying code under Linux]
* [https://web.archive.org/web/20100717072236/http://public.carnet.hr/~jbrecak/sm.html Self-modifying C code]
* [https://flint.cs.yale.edu/flint/publications/smc.html Certified Self-Modifying Code]

{{DEFAULTSORT:Self-Modifying Code}}
[[Category:Programming paradigms]]