Assembler operating system. Assembler School: Operating System Development

Recently I decided to learn assembler, but I did not want to write code for its own sake. I thought that while learning assembler I should also master some subject area, so I chose to write a bootloader. The results of my research are here in this blog.

I would like to say right away that I love theory combined with practice, so let's start.

First, I'll show you how to create the simplest MBR so that we can enjoy the result in the shortest possible time. As we complicate the practical examples, I will provide theoretical information.

First, let's make a bootloader for the USB stick!

Attention! Our first assembler program will work on a USB flash drive as well as on other devices such as a floppy disk or a hard disk. Later, so that all examples work correctly, I will give a number of clarifications about how the code behaves on different devices.

We will write in FASM, since it is considered the best assembler for writing loaders such as the MBR. The second reason for choosing FASM is that it greatly simplifies assembling files: no command-line directives and similar nonsense that can completely discourage you from learning assembler and reaching your goals. So, at the initial stage we need two programs (FASM and DMDE) and some spare USB flash drive of minimal size. I dug up a 1 GB one (it formats quickly, and it is no great loss if something goes wrong). After our bootloader has been written to it, the flash drive will stop working as a normal drive; my Windows 7 refuses to format my stick. I advise you to bring the flash drive back to life with the HP USB Disk Storage Format Tool (HPUSBFW.EXE) or another utility for formatting flash drives.

We install them and put the shortcuts on the desktop or wherever you like.

Preparation is over, let's get down to action.

Open Fasmw.exe and type the following. We'll sketch out the bare minimum of code to see a result, and later we will analyze what is going on here. For now I give only brief comments.

FASM code: ============= boot.asm ==============

org 7C00h               ; addresses in our program are calculated relative to this origin
use16                   ; generate 16-bit code
cli                     ; disable interrupts while the segment registers and stack change
mov ax, 0
mov es, ax              ; ES:BP must point to the string for function 13h
mov ss, ax              ; stack segment 0, so the stack sits just below the loader
mov sp, 7C00h
sti                     ; enable interrupts again (after changing the addresses)
mov ax, 0003h           ; set the video mode before printing a string to the screen
int 10h
mov ax, 1301h           ; the actual output: function 13h of int 10h (more on this later)
mov bp, stroka          ; address of the string to print
mov dx, 0000h           ; row and column where the text is displayed
mov cx, 15              ; number of characters in the string
mov bx, 000eh           ; 00 = video page number (better not to touch), 0e = character attribute (color, background)
int 10h
jmp $                   ; mark time (loops the program at this point)
stroka db "Ok, MBR loaded!"
times 510 - ($ - $$) db 0   ; fill the gap between the previous byte and byte 510 with zeros
db 0x55, 0xAA           ; last two bytes: the boot signature

Compile this code (Ctrl + F9) in FASM and save the resulting binary file as boot.bin somewhere convenient. Before writing our binary to the USB flash drive, a little theory.

When you plug the USB flash drive into the computer, it is not at all obvious to the BIOS that you want to boot from it, so in the BIOS settings you need to select the device to boot from. So choose to boot from USB (you will have to figure out how to do this yourself, since BIOS interfaces vary; you can google the BIOS settings for your motherboard. As a rule, there is nothing complicated).

Now that the BIOS knows that you want to boot from the flash drive, it must make sure that sector zero on the flash drive is bootable. To do this, the BIOS looks at the last two bytes of sector zero, and only if they are equal to 0x55 0xAA will it load the sector into RAM. Otherwise, the BIOS will simply skip your flash drive. Having found these two magic bytes, it loads sector zero into RAM at address 0000:7C00h, then forgets about the USB flash drive and transfers control to this address. Now all power over the computer belongs to your bootloader, and, already running from RAM, it can load additional code from the USB flash drive. Now we will see how this very sector looks in the DMDE program.

1. Insert your USB flash drive into the computer and make sure that it does not contain the information you need.

2. Open the DMDE program. Follow all further steps in the figures:

After going through these pictures, you will know how to write your MBR onto a USB stick. And this is what the long-awaited result of our bootloader's work looks like:

By the way, if we talk about the minimal bootloader code, then it may look like this:

org 7C00h
jmp $
db 508 dup (0)
db 0x55,0xAA

Such a bootloader, having received control, simply hangs the computer, executing one meaningless command, jmp $, in a loop. I call it marking time.
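To see that this minimal loader really is nothing more than 512 bytes, here is a small host-side sketch in Python (not part of the FASM workflow above) that builds the same sector by hand. It relies on the fact that jmp $ assembles to the two bytes EB FE (a short jump back onto itself); the function names are my own and purely illustrative.

```python
# Sketch: build the 512-byte "marking time" boot sector by hand.
SECTOR_SIZE = 512
BOOT_SIGNATURE = bytes([0x55, 0xAA])


def build_minimal_boot_sector() -> bytes:
    """Return jmp $ + zero padding + the 0x55 0xAA signature."""
    code = bytes([0xEB, 0xFE])  # short-jump encoding of `jmp $`
    padding = bytes(SECTOR_SIZE - len(code) - len(BOOT_SIGNATURE))  # 508 zeros
    return code + padding + BOOT_SIGNATURE


def has_boot_signature(sector: bytes) -> bool:
    """Check the condition the BIOS looks for: bytes 510 and 511."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


sector = build_minimal_boot_sector()
print(len(sector), has_boot_signature(sector))
```

Writing the result to a file should give the same bytes that FASM produces for the four-line listing, which is a handy way to check a binary before putting it on the stick.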

I posted a video on YouTube that may help you:

Finally, a few quick facts about the bootloader:

1. The loader, also known as the bootloader, also known as the MBR, is 512 bytes in size. Historically, this condition must be met to support older media and devices.
2. The loader is always located in sector zero of a flash drive, floppy disk, or hard disk, from the point of view of DMDE or other hex editors that can work with devices. To write the binary (our boot.bin) to one of these devices, we do not need to think about their internal physical structure. DMDE simply knows how to read sectors on these devices and displays them in LBA mode (it just numbers them from 0 to the last sector). You can read about LBA
3. The bootloader must always end with two bytes 0x55 0xAA.
4. The loader is always loaded into memory at the address 0000: 7C00h.
5. The operating system begins with the bootloader.
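The address 0000:7C00h in fact 4 is a real-mode segment:offset pair. As a rough illustration (a Python sketch with a helper name of my own choosing, ignoring address wrap-around), the physical address is the segment multiplied by 16 plus the offset, which is why the same byte can also be addressed as 07C0:0000h:

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode physical address: segment * 16 + offset (no wrap-around handling)."""
    return (segment << 4) + offset


# Two different segment:offset pairs that point at the loaded boot sector.
print(hex(linear_address(0x0000, 0x7C00)))
print(hex(linear_address(0x07C0, 0x0000)))
```

Both calls give the same physical address, so a loader may use either org 7C00h with zeroed segment registers or a segment value of 07C0h with offsets counted from zero.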

I'll say right away: do not close the article thinking "Damn, another Popov." He just has a copied Ubuntu, while I have everything from scratch, including the kernel and applications. So, the continuation is under the cut.

OS group: here.
I'll give you one screenshot first.

There are no more of them for now, so here is more detail about why I am writing it.

It was a warm April evening, a Thursday. Since childhood I had dreamed of writing an OS, when I suddenly thought: "Now I know the pros and cons, why not make my dream come true?" I googled sites on this topic and found an article on Habr: "How to start and not quit writing OS". Thanks to the author for the link to the OSDev Wiki at the bottom. I went there and started work. One article there had all the information on a minimal OS. I built cross-gcc and binutils and then rewrote everything from there. You should have seen my joy when I saw the message "Hello, kernel World!" I jumped right out of my chair and realized that I would not give up. I wrote a "console" (in quotes, because I had no access to the keyboard), but then I decided to write a window system. In the end it worked, but I still had no access to the keyboard. Then I decided to come up with a name based on the X Window System. I googled Y Window System: it exists. So I named mine Z Window System 0.1, included in OS365 pre-alpha 0.1. And yes, no one saw it except me. Then I figured out how to implement keyboard support. Here is a screenshot of the very first version, when there was still nothing, not even the window system:

The text cursor did not even move in it, as you can see. Then I wrote a couple of simple Z-based applications. And here comes the 1.0.0 alpha release. There were a lot of things, even the system menu. And the file manager and calculator just didn't work.

I was positively terrorized by a friend who cares only about looks (Mitrofan, sorry). He kept saying, "Implement VBE mode 1024*768*32, implement it, implement it! Come on, implement it!" Well, I got tired of listening to him and implemented it after all. The implementation is described below.

I did it all with my bootloader, namely GRUB. With it you can set a graphics mode without complications by adding a few magic lines to the Multiboot header.

.set ALIGN,    1<<0
.set MEMINFO,  1<<1
.set GRAPH,    1<<2
.set FLAGS,    ALIGN | MEMINFO | GRAPH
.set MAGIC,    0x1BADB002
.set CHECKSUM, -(MAGIC + FLAGS)
.align 4
.long MAGIC
.long FLAGS
.long CHECKSUM
.long 0, 0, 0, 0, 0
.long 0 # 0 = set graphics mode
.long 1024, 768, 32 # Width, height, depth
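The CHECKSUM line works because the Multiboot header requires the magic, flags and checksum fields to add up to zero as a 32-bit unsigned value. A quick way to convince yourself is this Python sketch, which uses the same constants as the header above (the function name is just for illustration):

```python
MAGIC = 0x1BADB002
ALIGN, MEMINFO, GRAPH = 1 << 0, 1 << 1, 1 << 2


def multiboot_checksum(magic: int, flags: int) -> int:
    """Value that makes magic + flags + checksum equal 0 modulo 2**32."""
    return (-(magic + flags)) & 0xFFFFFFFF


flags = ALIGN | MEMINFO | GRAPH
checksum = multiboot_checksum(MAGIC, flags)
print(hex(flags), hex(checksum))
```

If you change the flags (for example, drop GRAPH), the checksum changes with them, which is why the header computes it from FLAGS rather than hard-coding a number.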
And then I take the framebuffer address and screen resolution from the Multiboot information structure and write pixels there. VESA made everything very confusing: RGB colors must be written in reverse order (not R G B, but B G R). For several days I could not understand why the pixels were not displayed. In the end I realized that I had forgotten to change the values of the 16 color constants from 0...15 to their RGB equivalents. After that I made a release and, at the same time, added a gradient background. Then I made a console and 2 applications and released 1.2. Oh yes, I almost forgot - you can download the OS at
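To illustrate the byte order described above, here is a rough Python model of writing one pixel into a 32-bit linear framebuffer. It is only a sketch under assumptions: the buffer is an ordinary bytearray standing in for video memory, the pitch (bytes per row) is assumed to be width * 4 with no padding, and put_pixel is a made-up helper, not a function from OS365.

```python
def put_pixel(framebuffer: bytearray, pitch: int, x: int, y: int,
              r: int, g: int, b: int) -> None:
    """Store one 32-bit pixel; blue goes to the lowest address, then green, then red."""
    offset = y * pitch + x * 4
    framebuffer[offset:offset + 4] = bytes([b, g, r, 0])


WIDTH, HEIGHT = 1024, 768
PITCH = WIDTH * 4  # assumption: rows are not padded
fb = bytearray(PITCH * HEIGHT)

put_pixel(fb, PITCH, x=2, y=1, r=0xFF, g=0x80, b=0x10)
print(fb[PITCH + 8:PITCH + 12].hex())
```

On real hardware the pitch should be taken from the Multiboot information structure rather than computed, since the mode may pad each row.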

Original: AsmSchool: Make an operating system
Author: Mike Saunders
Published Date: April 15, 2016
Translation: A. Panin
Translation date: April 16, 2016

Part 4: With the skills you've learned from previous articles in the series, you can start developing your own operating system!

What is it for?

To understand how compilers work.
To understand the instructions of the CPU.
To optimize your code for performance.

Over the course of several months, we have gone through a difficult path, which began with the development of simple programs in assembly language for Linux and ended in the last article in the series with the development of self-sufficient code that runs on a personal computer without an operating system. Well, now we will try to collect all the information together and create a real operating system. Yes, we will follow in the footsteps of Linus Torvalds, but first it is worth answering the following questions: "What is an operating system? Which of its functions will we have to recreate?"

In this article, we will focus only on the main functions of the operating system: loading and executing programs. Complex operating systems perform many more functions, such as managing virtual memory and processing network packets, but these take years of continuous work to implement correctly, so in this article we will only cover the basic functions that are present in any operating system. Last month we developed a small program that fit in a 512-byte sector of a floppy disk (its first sector), and now we will slightly modify it in order to add the function of loading additional data from the disk.

Boot Loader Development

We could try to reduce the size of the binary code of our operating system as much as possible in order to place it in the first 512-byte sector of a floppy disk, the same one that is loaded by BIOS, but in this case we will not be able to implement any interesting functions. Therefore, we will use these 512 bytes to house the binary code of a simple boot loader that will load the binary code of the OS kernel into RAM and execute it. (After that, we will develop the OS kernel itself, which will load the binary code of other programs from disk and also execute it, but this will be discussed a little later.)

You can download the source code for the examples discussed in this article at www.linuxvoice.com/code/lv015/asmschool.zip. And this is our bootloader code from a file named boot.asm:

BITS 16

    jmp short start         ; Jump to the label, skipping the disk description
    nop                     ; Padding before the disk description

%include "bpb.asm"

start:
    mov ax, 07C0h           ; Load address
    mov ds, ax              ; Data segment

    mov ax, 9000h           ; Prepare the stack
    mov ss, ax
    mov sp, 0FFFFh          ; The stack grows down!

    cld                     ; Clear the direction flag

    mov si, kern_filename
    call load_file

    jmp 2000h:0000h         ; Jump to the OS kernel binary code loaded from the file

kern_filename db "MYKERNELBIN"

%include "disk.asm"

    times 510-($-$$) db 0   ; Pad the binary code with zeros up to 510 bytes
    dw 0AA55h               ; Bootloader binary code ending signature

buffer:                     ; Start of buffer for disk contents

In this code, the first CPU instruction is the jmp instruction, which is located after the BITS directive telling the NASM assembler to use 16-bit mode. As you probably remember from the previous article in the series, the execution of a 512-byte binary code loaded by means of BIOS from disk starts from the very beginning, but we have to go to the label to skip a special set of data. Obviously, last month we just wrote the code to the beginning of the disk (using the dd utility) and left the rest of the disk space empty.

Now we have to use a floppy disk with a suitable MS-DOS file system (FAT12), and in order to work correctly with this file system, we need to add a set of special data near the beginning of the sector. This set is called the BIOS Parameter Block (BPB) and contains information such as the disk label, the number of sectors, and so on. It should not interest us at this stage, since more than one series of articles can be devoted to such topics, which is why we have placed all related instructions and data in a separate source code file named bpb.asm.

Based on the above, this directive from our code is extremely important:

%include "bpb.asm"

This is a NASM directive that allows the contents of a specified source file to be included in the current source file during assembly. Thus, we can make the code of our bootloader as short and understandable as possible, moving all the details of the BIOS parameter block into a separate file. The BIOS parameter block must be located three bytes after the beginning of the sector, and since the jmp instruction takes only two bytes, we have to use the nop instruction (its name stands for "no operation" - this is an instruction that does nothing except waste CPU cycles) to fill in the remaining byte.
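If you are curious what sits behind those first three bytes, the following Python sketch reads the leading BPB fields from a boot sector. The field offsets follow the standard DOS layout (8-byte label at offset 3, bytes per sector at offset 11, and so on); the sample sector is built in the script itself with typical 1.44 MB floppy values and a made-up "MYOS" label, so it is an illustration, not the contents of bpb.asm.

```python
import struct

# Skip jmp + nop (3 bytes), then the disk label and the classic BPB fields.
BPB_FORMAT = "<3x8sHBHBHHBHHH"
BPB_FIELDS = (
    "oem_label", "bytes_per_sector", "sectors_per_cluster",
    "reserved_sectors", "fat_count", "root_dir_entries",
    "total_sectors", "media_descriptor", "sectors_per_fat",
    "sectors_per_track", "head_count",
)


def parse_bpb(sector: bytes) -> dict:
    """Decode the BPB fields that start three bytes into the sector."""
    values = struct.unpack_from(BPB_FORMAT, sector, 0)
    return dict(zip(BPB_FIELDS, values))


# Hypothetical sample sector: jmp short, nop, then typical floppy values.
sample = bytes([0xEB, 0x3C, 0x90]) + struct.pack(
    "<8sHBHBHHBHHH", b"MYOS    ", 512, 1, 1, 2, 224, 2880, 0xF0, 9, 18, 2
)
sample = sample.ljust(510, b"\x00") + bytes([0x55, 0xAA])

bpb = parse_bpb(sample)
print(bpb["oem_label"], bpb["bytes_per_sector"], bpb["total_sectors"])
```

Running parse_bpb on the first 512 bytes of a real FAT12 floppy image should show the values that the formatting tool wrote there.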

Working with the stack

Next, we use instructions similar to those discussed in the previous article to prepare the registers and the stack, as well as the cld instruction (which stands for "clear direction"). It clears the direction flag, so that string instructions such as lodsb increase the value in the SI register after execution rather than decrease it.

After that, we put the address of the string in the SI register and call our load_file function. But think for a minute - we haven't developed this feature yet! Yes, it's true, but its implementation can be found in another source file we include, called disk.asm.

The FAT12 file system used on floppy disks that are formatted in MS-DOS is one of the simplest file systems in existence, but it still requires a lot of code to work with its contents. The load_file subroutine is about 200 lines long and will not be presented in this article, since we are considering the process of developing an operating system, not a driver for a specific file system, so it is not very reasonable to spend magazine pages on it. In general, we included the disk.asm source file almost at the end of the current source file and can forget about it. (If you are interested in the structure of the FAT12 file system, you can check out the excellent overview at http://tinyurl.com/fat12spec, then look into the disk.asm source file - the code in it is well commented.)

In any case, the load_file subroutine loads the binary code from the file with the name pointed to by the SI register into segment 2000h at offset 0, after which we jump to its beginning for execution. And that's all - the operating system kernel is loaded and the boot loader has done its job!

You may have noticed that our code uses MYKERNELBIN as the name of the operating system kernel file instead of MYKERNEL.BIN, which would fit the 8.3 naming scheme used on DOS floppy disks. In fact, the FAT12 file system uses an internal representation of file names, and we save space by using a file name that is already in this form, so our load_file subroutine does not need a mechanism to find the dot character and convert the file name to the internal file system representation.
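The internal representation mentioned here is simply 11 bytes: the base name padded with spaces to 8 characters, followed by the extension padded to 3, all in upper case and without the dot. A minimal Python sketch of the conversion that the boot loader avoids (to_fat_name is a name made up for this example):

```python
def to_fat_name(filename: str) -> bytes:
    """Convert NAME.EXT into the 11-byte form stored in a FAT directory entry."""
    base, _, ext = filename.upper().partition(".")
    if not 1 <= len(base) <= 8 or len(ext) > 3:
        raise ValueError(f"{filename!r} does not fit the 8.3 scheme")
    return (base.ljust(8) + ext.ljust(3)).encode("ascii")


print(to_fat_name("mykernel.bin"))
print(to_fat_name("test.bin"))
```

Because MYKERNEL is exactly eight characters long, its internal form contains no padding, which is why the string in the boot loader looks like the file name with the dot removed.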

After the line with the directive to connect the source code file disk.asm, there are two lines intended to pad the binary code of the boot loader with zeros to 512 bytes and include the end mark of its binary code (this was discussed in the last article). Finally, at the very end of the code is the "buffer" label, which is used by the load_file subroutine. In general, the load_file subroutine requires free space in RAM to perform some intermediate steps in the process of searching for a file on disk, and we have enough free space after loading the bootloader, so we place the buffer here.

To assemble the boot loader, use the following command:

nasm -f bin -o boot.bin boot.asm

Now we need to create a virtual floppy disk image in MS-DOS format and add the binary code of our bootloader to its first 512 bytes using the following commands:

mkdosfs -C floppy.img 1440
dd conv=notrunc if=boot.bin of=floppy.img
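The conv=notrunc option matters: it makes dd overwrite only the first 512 bytes and leave the rest of the image in place. For readers without dd at hand, roughly the same step can be sketched in Python. The helper name and the temporary demo files below are illustrative only; the demo uses an empty 1440 KB file instead of a real formatted image.

```python
import os
import tempfile


def write_boot_sector(image_path: str, boot_path: str) -> None:
    """Overwrite the first 512 bytes of the image without truncating it."""
    with open(boot_path, "rb") as boot_file:
        boot = boot_file.read()
    if len(boot) != 512:
        raise ValueError("boot sector must be exactly 512 bytes")
    with open(image_path, "r+b") as image:  # r+b keeps the existing contents
        image.seek(0)
        image.write(boot)


# Demo on throwaway files: a blank 1440 KB image and a dummy boot sector.
with tempfile.TemporaryDirectory() as tmp:
    image_path = os.path.join(tmp, "floppy.img")
    boot_path = os.path.join(tmp, "boot.bin")
    with open(image_path, "wb") as f:
        f.write(bytes(1440 * 1024))
    with open(boot_path, "wb") as f:
        f.write(bytes([0xEB, 0xFE]) + bytes(508) + bytes([0x55, 0xAA]))
    write_boot_sector(image_path, boot_path)
    image_size = os.path.getsize(image_path)
    with open(image_path, "rb") as f:
        first_sector = f.read(512)

print(image_size, first_sector[510:].hex())
```

Opening the image in "wb" mode instead would be the equivalent of forgetting conv=notrunc: the image would shrink to 512 bytes and the file system would be lost.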

This completes the bootloader development process! We now have a boot floppy disk image that allows us to load the operating system kernel binary from a file named mykernel.bin and execute it. Next, a more interesting part of the work awaits us - the development of the operating system kernel itself.

Operating system kernel

We want our operating system kernel to perform many important tasks: displaying a greeting, accepting input from the user, determining if the input is a supported command, and executing programs from disk after the user specifies their names. This is the operating system kernel code from the mykernel.asm file:

    mov ax, 2000h
    mov ds, ax
    mov es, ax

loop:
    mov si, prompt
    call lib_print_string

    mov si, user_input
    call lib_input_string

    cmp byte [si], 0
    je loop

    cmp word [si], "ls"
    je list_files

    mov ax, si
    mov cx, 32768
    call lib_load_file
    jc load_fail

    call 32768
    jmp loop

load_fail:
    mov si, load_fail_msg
    call lib_print_string
    jmp loop

list_files:
    mov si, file_list
    call lib_get_file_list
    call lib_print_string
    jmp loop

    prompt          db 13, 10, "MyOS>", 0
    load_fail_msg   db 13, 10, "Not found!", 0
    user_input      times 256 db 0
    file_list       times 1024 db 0

%include "lib.asm"

Before looking at the code, you should pay attention to the last line with the directive to include the source code file lib.asm, which is also located in the asmschool.zip archive from our website. This is a library of useful routines for working with the screen, keyboard, strings and disks, which you can also use - in this case, we include this source code file at the very end of the main source file of the operating system kernel in order to keep the latter as compact and tidy as possible. See the section "Lib.asm library routines" for more information on all of the available routines.

In the first three lines of the operating system kernel code, we fill the segment registers with data to point to segment 2000h, into which the binary code was loaded. This is important to ensure that instructions such as lodsb work correctly, since they should read data from the current segment and not from any other segment. After that, we will not perform any additional operations with segments; our operating system will run with 64 KB of RAM!

Further in the code there is a label corresponding to the beginning of the loop. First of all, we use one of the lib.asm library routines, lib_print_string, to print the greeting. Bytes 13 and 10 before the greeting line are newline characters, due to which the greeting will not be displayed immediately after the output of any program, but always on a new line.

We then use another lib.asm subroutine called lib_input_string, which takes the user typed characters and stores them in a buffer pointed to in the SI register. In our case, the buffer is declared near the end of the operating system kernel code as follows:

user_input times 256 db 0

This declaration allows you to create a 256-character buffer filled with zeros — it must be large enough to store commands for a simple operating system like ours!

Next, we validate user input. If the first byte of the user_input buffer is zero, then the user simply pressed the Enter key without entering any command; remember that all strings are null-terminated. So in this case we should just go to the beginning of the loop and print the greeting again. However, in the event that the user enters any command, we will have to first check if he has entered the ls command. Until now, you could only observe comparisons of individual bytes in our assembly language programs, but do not forget that it is also possible to perform comparisons of two-byte values or machine words. In this code, we compare the first machine word from the user_input buffer with the machine word corresponding to the ls line and, if they are identical, move to the code block below. Within this block of code, we use another subroutine from lib.asm to get a comma-separated list of files on disk (which must be stored in the file_list buffer), display that list, and move back to the loop to process user input.
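The two comparisons can be modelled on the host to see exactly what they test. In this Python sketch (the function names are invented for illustration) the "ls" check reads the first two bytes of the buffer as one little-endian 16-bit word, which is how NASM stores the character constant "ls":

```python
def first_word(buffer: bytes) -> int:
    """Read the first two bytes as a little-endian 16-bit word."""
    return int.from_bytes(buffer[:2], "little")


LS_WORD = first_word(b"ls")  # 'l' lands in the low byte, 's' in the high byte


def classify(user_input: bytes) -> str:
    """Mirror the kernel's checks: empty line, ls command, or program name."""
    if user_input[0] == 0:
        return "empty"
    if first_word(user_input) == LS_WORD:
        return "list_files"
    return "load_program"


for line in (b"", b"ls", b"test.bin", b"lsfoo"):
    print(line, classify(line.ljust(256, b"\x00")))
```

Note a side effect of comparing a single word: any input that merely starts with the letters ls, such as lsfoo, is also treated as the ls command.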

Execution of third-party programs

If the user does not enter the ls command, we are assuming they entered the name of the program from disk, so it makes sense to try to load it. Our lib.asm library contains an implementation of the useful subroutine lib_load_file, which parses the tables of the FAT12 file system on the disk: it takes a pointer to the beginning of the line with the file name using the AX register, and also an offset value for loading the binary code from the program file using the CX register. We are already using the SI register to store a pointer to the user input string, so we copy that pointer into the AX register, and then put the value 32768, used as an offset for loading the binary from the program file, into the CX register.

But why do we use this particular value as an offset for loading the binary code from the program file? Well, this is just one of the memory map options for our operating system. Due to the fact that we are working in one 64 KB segment, and the binary code of our kernel is loaded at offset 0, we have to use the first 32 KB of memory for the kernel data, and the remaining 32 KB for the data of the loaded programs. Thus, offset 32768 is the middle of our segment and allows us to provide sufficient RAM for both the operating system kernel and the loaded programs.
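The arithmetic behind this memory map is small enough to write down. The following Python sketch uses only the numbers from the text (one 64 KB segment, programs loaded at offset 32768); the helper names are illustrative assumptions.

```python
SEGMENT_SIZE = 64 * 1024      # one real-mode segment
PROGRAM_OFFSET = 32768        # where loaded programs start

KERNEL_LIMIT = PROGRAM_OFFSET                   # bytes available to the kernel
PROGRAM_LIMIT = SEGMENT_SIZE - PROGRAM_OFFSET   # bytes available to a program


def program_fits(size: int) -> bool:
    """True if a program of this size stays inside the upper half of the segment."""
    return 0 < size <= PROGRAM_LIMIT


print(KERNEL_LIMIT, PROGRAM_LIMIT, program_fits(7), program_fits(40000))
```

The routine that loads the file does not have to know about this split; a program larger than the upper half would simply run past the end of the segment, so a check like this belongs in the tooling or the loader.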

After that, the lib_load_file subroutine performs an extremely important operation: if it cannot find a file with the given name on the disk, or for some reason cannot read it from disk, it simply exits and sets a special carry flag. This is a CPU status flag that is set during some arithmetic operations and should not concern us at the moment, but we can check this flag to make quick decisions. If lib_load_file sets the carry flag, we use the jc instruction (jump if carry) to jump to the block of code that outputs the error message and returns to the beginning of the user input loop.

Conversely, if the carry flag is not set, we can conclude that the lib_load_file subroutine has successfully loaded the binary code from the program file into RAM at address 32768. All we need in this case is to initiate the execution of the binary code loaded at this address, that is, start execution of the program specified by the user! And after this program uses the ret instruction (to return to the calling code), we just need to return to the user input processing loop. Thus, we have created an operating system: it consists of the simplest mechanisms for parsing commands and loading programs, implemented in about 40 lines of assembly code, albeit with a lot of help from the subroutines from the lib.asm library.

To assemble the operating system kernel code, use the following command:

nasm -f bin -o mykernel.bin mykernel.asm

After that, we have to somehow add the mykernel.bin file to the floppy disk image file. If you are familiar with mounting disk images using loopback devices, you can access the contents of the floppy.img disk image that way, but there is an easier way, which is to use the GNU Mtools toolkit (www.gnu.org/software/mtools). This is a set of programs for working with floppy disks that use MS-DOS/FAT12 file systems, available from the package repositories of all popular Linux distributions, so you just have to use apt-get, yum, pacman or whatever utility is used to install software packages in your distribution.

After installing the appropriate software package, you will need to run the following command to add the mykernel.bin file to the floppy.img disk image file:

mcopy -i floppy.img mykernel.bin ::/

Notice the funny characters at the end of the command: colon, colon and slash. We're almost ready to launch our operating system now, but what's the point when there are no applications for it yet? Let's fix this misunderstanding by developing an extremely simple application. Yes, now you will be developing an application for your own operating system - just imagine how much your authority will rise in the ranks of the geeks. Save the following code in a file called test.asm:

    org 32768

    mov ah, 0Eh
    mov al, "X"
    int 10h

    ret

This code simply uses the BIOS function to display the "X" character on the screen, and then returns control to the code that called it - in our case, this code is the operating system code. The org line, with which the application source code begins, is not an instruction from the central processor, but a directive of the NASM assembler telling it that the binary code will be loaded into RAM at offset 32768, therefore, it is necessary to recalculate all offsets taking this circumstance into account.

This code also needs to be assembled, and the resulting binary file needs to be added to the floppy disk image file:

nasm -f bin -o test.bin test.asm
mcopy -i floppy.img test.bin ::/

Now take a deep breath, get ready to contemplate the unsurpassed results of your own work, and load the floppy disk image using a PC emulator such as Qemu or VirtualBox. For example, the following command can be used for this purpose:

qemu-system-i386 -fda floppy.img

Voila: the boot.bin boot loader, which we integrated into the first sector of the disk image, loads the operating system kernel mykernel.bin, which displays a welcome message. Enter the ls command to get the names of the two files located on disk (mykernel.bin and test.bin), and then enter the name of the latter file to execute it and display the X character on the screen.

It's cool, isn't it? Now you can start tweaking your operating system's shell, adding new command implementations, and adding additional program files to disk. If you want to run this operating system on a real PC, you should refer to the section "Running the Boot Loader on a Real Hardware Platform" from the previous article in this series - you will need exactly the same commands. Next month, we will make our operating system more powerful by allowing downloadable programs to use system functions and thus implementing the concept of code sharing to reduce duplication. Most of the work is still ahead.

Lib.asm library routines

As mentioned earlier, lib.asm provides a large set of useful routines for use within your operating system kernels and individual programs. Some of them use instructions and concepts that have not yet been touched on in the articles of this series, others (such as routines for working with disks) are closely tied to the structure of file systems, but if you consider yourself competent in these matters, you can read their implementations yourself and understand how they work. That being said, it's more important to figure out how to call them from your own code:

lib_print_string - Takes a pointer to a null-terminated string through the SI register and prints that string to the screen.
lib_input_string - Accepts a pointer to a buffer through the SI register and fills this buffer with characters entered by the user using the keyboard. After the user presses the Enter key, the line in the buffer is null terminated and control returns to the calling program code.
lib_move_cursor - Moves the cursor on the screen to the position with coordinates passed through the DH (line number) and DL (column number) registers.
lib_get_cursor_pos - Returns the numbers of the current row and column through the DH and DL registers, respectively.
lib_string_uppercase - Takes a pointer to the beginning of a null-terminated string through the AX register and converts the characters in the string to uppercase.
lib_string_length - Takes a pointer to the beginning of a null-terminated string via the AX register and returns its length via the AX register.
lib_string_compare - Takes pointers to the beginning of two null-terminated strings through the SI and DI registers and compares the strings. Sets the carry flag if the strings are identical (so that you can branch with the jc instruction) or clears this flag if the strings are different (so that you can use the jnc instruction).
lib_get_file_list - Takes a pointer to the beginning of the buffer through the SI register and places a null-terminated string in that buffer containing a comma-separated list of file names from disk.
lib_load_file - Takes a pointer to the beginning of the string containing the file name through the AX register and loads the contents of the file at the offset passed through the CX register. Returns the number of bytes copied into memory (that is, the size of the file) using the BX register, or sets the carry flag if a file with the given name is not found.

Today there is a curious example in our Cabinet of Curiosities - an operating system written in pure assembler. Together with drivers, a graphical shell, dozens of pre-installed programs and games, it takes less than one and a half megabytes. Meet the exceptionally fast and predominantly Russian OS "Kolibri".

The development of "Kolibri" went pretty quickly until 2009. The bird learned to fly on different hardware, requiring at minimum a first-generation Pentium and eight megabytes of RAM. The minimum system requirements for Kolibri are:

CPU: Pentium, AMD 5x86 or Cyrix 5x86 without MMX @ 100 MHz;
RAM: 8 MB;
video card: VESA-compatible with support for VGA mode (640 × 480 × 16).

The modern "Kolibri" is a regularly updated nightly build of the latest official version, released at the end of 2009. We tested build 0.7.7.0+ from August 20, 2017.

WARNING

In the default settings, KolibriOS does not have access to disks that are visible through the BIOS. Think carefully and make a backup before changing this setting.

Changes in the nightly builds, although small, have accumulated over the years. The updated "Kolibri" can write to FAT16-32 and ext2-ext4 partitions and supports other popular file systems (NTFS, XFS, ISO-9660) in read-only mode. Support for USB and network cards has been added, along with a TCP/IP stack and audio codecs. In general, you can already do something in it, rather than just take one look at an ultralight operating system with a GUI and be impressed by its launch speed.

Like the previous versions, the latest "Kolibri" is written in flat assembler (FASM) and fits on one floppy disk: 1.44 MB. Thanks to this, it can be placed entirely in some kind of specialized memory. For example, enthusiasts have written KolibriOS directly into the BIOS flash chip. During operation, it can fit entirely in the cache of some processors. Just imagine: the entire operating system, along with programs and drivers, is cached!

INFO

When you visit the kolibrios.org website, your browser may warn you of danger. The reason seems to be the assembler programs in the distribution. VirusTotal currently rates the site as completely safe.

"Kolibri" can be easily loaded from a floppy disk, hard drive, flash drive, Live CD or in a virtual machine. For emulation, it is enough to specify the type of OS "other", allocate one processor core and some RAM to it. It is not necessary to connect the disk, and if you have a router with DHCP, the "Kolibri" will instantly connect to the Internet and the local network. Immediately upon loading, you will see a corresponding notification.

One problem is that the built-in Kolibri browser does not support HTTPS. As a result, we could not view the site in it, nor open the pages of Google, Yandex, Wikipedia, Sberbank... in fact, not a single familiar address. Everyone switched to the secure protocol long ago. The only site with old-school plain HTTP that I came across was the "portal of the Russian Government", but it did not look its best in a text browser either.

The appearance settings in Kolibri have improved over the years, but are still far from ideal. The list of supported video modes is shown on the Kolibri boot screen when you press the key with the Latin letter a.

The list of available options is limited, and the resolution you need may be missing. If you have a video card with an AMD (ATI) GPU, you can add custom settings right away. To do this, pass the ATIKMS loader the parameter -m<width>x<height>x<refresh>, for example:

/RD/1/DRIVERS/ATIKMS -m1280x800x60 -1

Here /RD/1/DRIVERS/ATIKMS is the path to the loader (RD stands for RAM Disk).

While the system is running, the selected video mode can be viewed with the vmode command and (in theory) switched manually. If "Kolibri" is running in a virtual machine, this window will remain empty, but on a clean boot Intel video drivers can be added for chips from i915 through Skylake inclusive.

Surprisingly, a whole bunch of games fit into KolibriOS. Among them are logic and arcade games, tag, snake, tanks (no, not WoT): a whole "Game Center"! Even Doom and Quake have been ported to Kolibri.

Another important thing was the FB2READ reader. It works correctly with Cyrillic and has text display settings.

I recommend storing all user files on a USB flash drive, but it must be connected via a USB 2.0 port. Our USB 3.0 flash drive (in the USB 2.0 port) with a capacity of 16 GB with the NTFS file system was identified immediately. If you need to write files, then you should connect a USB flash drive with a FAT32 partition.

The Kolibri distribution includes three file managers, utilities for viewing images and documents, audio and video players, and other custom applications. However, it focuses on assembly language development.

The built-in text editor has ASM syntax highlighting and even allows you to immediately run typed programs.

Among the development tools is the Oberon-07/11 compiler for i386 Windows, Linux and KolibriOS, as well as low-level emulators: E80 (a ZX Spectrum emulator), FCE Ultra (one of the best NES emulators), DOSBox v.0.74 and others. All of them were ported specifically to Kolibri.

If you leave KolibriOS for a few minutes, the screensaver will start. Lines of code will run on the screen, in which you can see a reference to MenuetOS.

Assembler

Assembler (from the English "assemble") is a program that translates assembly language into machine instructions.
There is an assembler for each processor architecture and for each OS or OS family. There are also so-called cross-assemblers, which run on a machine with one architecture (or under one OS) and assemble programs for a different target architecture or OS, producing executable code in a format suitable for the target architecture or the target OS environment.

X86 architecture

Assemblers for DOS

The most famous assemblers for the DOS operating system are the Borland Turbo Assembler (TASM) and the Microsoft Macro Assembler (MASM). The simple A86 assembler was also popular at one time.
Initially they supported only 16-bit instructions (before the Intel 80386 processor appeared). Later versions of TASM and MASM also support 32-bit instructions, as well as the instructions introduced in more recent processors and architecture-specific instruction set extensions (such as MMX, SSE, 3DNow!, etc.).

Microsoft Windows

With the arrival of Microsoft Windows came a TASM extension called TASM32, which made it possible to build programs that run under Windows. The last known version of TASM is 5.3, which supports MMX instructions and is currently included in Turbo C++ Explorer. Officially, however, development of the program has stopped completely.
Microsoft maintains its own product, the Microsoft Macro Assembler. It continues to evolve to this day, and the latest versions are included in the DDKs. The version aimed at building DOS programs, however, is no longer developed. In addition, Stephen Hutchesson has created a MASM programming package called "MASM32".

GNU and GNU/Linux

The GNU operating system includes the gcc compiler, which relies on the gas assembler (GNU Assembler, shipped as part of GNU Binutils). By default gas uses AT&T syntax, unlike most other popular assemblers, which use Intel syntax.

Portable assemblers

There is also an open source assembler project, versions of which are available for various operating systems, and which allows you to get object files for these systems. This assembler is called NASM (Netwide Assembler).
YASM is a BSD licensed version of NASM rewritten from scratch (with some exceptions).
FASM (Flat Assembler) is a young assembler released under a BSD license modified to prohibit relicensing (including under the GNU GPL). Versions exist for KolibriOS, GNU/Linux, MS-DOS and Microsoft Windows; it uses Intel syntax and supports AMD64 instructions.

RISC architectures

MCS-51
AVR
At the moment there are 2 compilers from Atmel (AVRStudio 3 and AVRStudio4). The second version is an attempt to fix the not very successful first one. There is also an assembler in WinAVR.
ARM
AVR32
MSP430
PowerPC

Assembling and compiling

The process of translating an assembly language program into object code is commonly called assembly. Unlike compiling, assembly is more or less unambiguous and reversible. In assembly language, each mnemonic corresponds to one machine instruction, while in high-level programming languages, a large number of different instructions can be hidden behind each expression. In principle, this division is rather arbitrary, so sometimes translation of assembly programs is also called compilation.
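To make the "unambiguous and reversible" point concrete, here is a minimal Python sketch of a toy assembler and disassembler for a handful of 16-bit x86 instructions. The opcode bytes are the real encodings (for example, B4 for mov ah, imm8 and CD for int), but the function names and the tiny supported subset are assumptions chosen for the illustration; a real x86 assembler has to deal with variable-length encodings and sometimes several valid encodings of the same instruction.

```python
# Toy model of "one mnemonic, one machine instruction" for a few
# 16-bit x86 instructions with fixed encodings.
NO_OPERAND = {"nop": 0x90, "ret": 0xC3, "hlt": 0xF4}   # one-byte instructions
IMM8 = {"mov ah": 0xB4, "mov al": 0xB0, "int": 0xCD}   # opcode + 8-bit immediate

def assemble(lines):
    """Translate source lines into machine code bytes."""
    out = bytearray()
    for line in lines:
        text = line.strip().lower()
        if text in NO_OPERAND:
            out.append(NO_OPERAND[text])
            continue
        # "mov ah, 0x09" -> head "mov ah", value "0x09"; "int 0x21" -> head "int"
        head, _, value = text.replace(",", " ").rpartition(" ")
        head = " ".join(head.split())
        out += bytes([IMM8[head], int(value, 0)])
    return bytes(out)

def disassemble(code):
    """Translate machine code bytes back into source lines."""
    names0 = {op: name for name, op in NO_OPERAND.items()}
    names1 = {op: name for name, op in IMM8.items()}
    lines, i = [], 0
    while i < len(code):
        op = code[i]
        if op in names0:
            lines.append(names0[op])
            i += 1
        else:
            name = names1[op]
            sep = ", " if " " in name else " "
            lines.append(f"{name}{sep}0x{code[i + 1]:02x}")
            i += 2
    return lines

source = ["mov ah, 0x09", "int 0x21", "int 0x20"]
machine_code = assemble(source)
print(machine_code.hex(" "))                  # b4 09 cd 21 cd 20
print(disassemble(machine_code) == source)    # True
```

The round trip succeeds because every line maps to exactly one instruction. A compiled high-level expression offers no such guarantee, since the compiler is free to choose among many instruction sequences.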

Assembly language

Assembly language is a type of low-level programming language: a human-readable notation for machine instructions. For brevity it is often called simply "assembler", which is inaccurate.

Assembly language instructions correspond one-to-one with processor instructions and, in fact, represent a convenient symbolic notation (mnemonic code) of commands and their arguments. Also, the assembly language provides basic program abstractions: linking parts of the program and data through labels with symbolic names (during assembly, an address is calculated for each label, after which each occurrence of the label is replaced with this address) and directives.
Assembler directives allow you to include data blocks (described explicitly or read from a file) in the program; repeat a fragment a specified number of times; assemble a fragment conditionally; set the address at which a fragment will execute, which may differ from where it is placed in memory; change the values of labels during assembly; use macros with parameters, and so on.
Each processor model, in principle, has its own set of instructions and the corresponding assembly language (or dialect).
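How labels and a few directives are handled can be sketched as a two-pass toy assembler in Python. The first pass only counts bytes to learn the address of each label; the second pass emits the bytes and substitutes those addresses. The function names and the supported subset (labels, org, db, dw and the single instruction mov dx, imm16, encoded as BA followed by a little-endian word) are assumptions made for this sketch and do not describe how FASM or any other real assembler is implemented.

```python
def split(line):
    """Split 'mnemonic operand, operand' into (mnemonic, [operands])."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",")] if rest.strip() else []
    return mnemonic, operands

def size_of(mnemonic, operands):
    """Number of bytes a statement occupies in the output."""
    if mnemonic == "db":
        return len(operands)
    if mnemonic == "dw":
        return 2 * len(operands)
    if mnemonic == "mov" and operands[0] == "dx":   # only 'mov dx, imm16' is modelled
        return 3
    raise ValueError(f"unsupported statement: {mnemonic}")

def assemble_two_pass(source):
    # Pass 1: assign an address to every label.
    labels, address, program = {}, 0, []
    for line in source:
        mnemonic, operands = split(line)
        if mnemonic.endswith(":"):                  # label definition
            labels[mnemonic[:-1]] = address
        elif mnemonic == "org":                     # directive: set the load address
            address = int(operands[0], 0)
        else:
            program.append((mnemonic, operands))
            address += size_of(mnemonic, operands)

    def value(token):
        """A label is replaced by its address, anything else is a number."""
        return labels[token] if token in labels else int(token, 0)

    # Pass 2: emit bytes, replacing each label with its address.
    out = bytearray()
    for mnemonic, operands in program:
        if mnemonic == "db":
            out += bytes(value(op) for op in operands)
        elif mnemonic == "dw":
            for op in operands:
                out += value(op).to_bytes(2, "little")
        else:                                       # mov dx, imm16
            out += b"\xBA" + value(operands[1]).to_bytes(2, "little")
    return bytes(out), labels

source = [
    "org 0x100",             # DOS .COM programs are loaded at offset 100h
    "mov dx, msg",           # forward reference: msg is defined further down
    "db 0xCD, 0x21",         # int 21h, written out as raw bytes
    "msg:",
    "db 0x48, 0x69, 0x24",   # 'H', 'i', '$'
]
code, labels = assemble_two_pass(source)
print(hex(labels["msg"]))    # 0x105
print(code.hex(" "))         # ba 05 01 cd 21 48 69 24
```

The forward reference to msg is the reason for two passes: when mov dx, msg is first seen, the address of msg is not yet known.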

Advantages and disadvantages

The virtues of assembly language

The minimum amount of redundant code, that is, the use of fewer instructions and memory accesses, allows you to increase the speed and reduce the size of the program.
Ensuring full compatibility and maximizing the capabilities of the desired platform: using special instructions and technical features of this platform.
When programming in assembly language, special capabilities become available: direct access to hardware, I/O ports and special processor registers, as well as the ability to write self-modifying code (that is, metaprogramming, without the need for a software interpreter).
The latest security technologies implemented in operating systems do not allow self-modifying code, since they rule out a memory region being written to and executed from at the same time (W^X technology in BSD systems, DEP in Windows).

Disadvantages of assembly language

Large amounts of code and many additional small tasks make the code very hard to read and understand, and therefore harder to debug and extend. Programming paradigms and other conventions are also difficult to enforce, which complicates collaborative development.
Fewer available libraries, their low compatibility with each other.
Non-portability to other platforms (except binary compatible).

Application

Directly follows from the advantages and disadvantages.
Since writing large programs in assembly language is extremely inconvenient, they are written in high-level languages. Assembly is used for small fragments or modules where one of the following is critical:
performance (drivers);
code size (boot sectors, software for microcontrollers and processors with limited resources, viruses, software protections);
special features: work directly with hardware or machine code, that is, operating system loaders, drivers, viruses, protection systems.

Linking assembly code to other languages

Since most of the time only program fragments are written in assembly, they need to be linked with the rest of the code in other languages. This is achieved in 2 main ways:
At compile time: inline assembler fragments are inserted into the program through special language directives, including whole procedures written in assembly language. This method is convenient for simple data transformations, but it cannot express full-fledged assembly code with its own data and subroutines, including subroutines with multiple entry and exit points, which high-level languages do not support.
At link time, or separate compilation. For the linked modules to interact, it is enough that the functions being linked follow the required calling conventions and data types. Individual modules can be written in any language, including assembly.

Syntax

There is no generally accepted standard for the syntax of assembly languages. However, there are standards that most assembly language developers adhere to. The main such standards are Intel syntax and AT&T syntax.

Instructions

The general format for writing instructions is the same for both standards:

[label:] opcode [operands] [; comment]

where the opcode is the actual mnemonic of the instruction to the processor. Prefixes (repetitions, changing the type of addressing, etc.) can be added to it.
The operands can be constants, register names, addresses in RAM, etc. The differences between Intel and AT&T standards relate mainly to the order of enumeration of operands and their syntax for different addressing methods.
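The difference in operand order and notation can be illustrated with a small Python sketch that rewrites simple Intel-syntax lines in AT&T style. It is deliberately limited to 32-bit registers and immediate constants; memory operands, other operand sizes and the function name intel_to_att are outside its scope or are assumptions made for the example.

```python
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}

def intel_to_att(line):
    """Convert 'op dst, src' (Intel) to 'opl src, dst' (AT&T).

    Handles 32-bit registers and immediate constants only.
    """
    mnemonic, _, rest = line.strip().partition(" ")
    dst, src = [part.strip() for part in rest.split(",")]

    def operand(text):
        # AT&T marks registers with % and immediate constants with $.
        return f"%{text}" if text in REGISTERS else f"${text}"

    # AT&T conventionally appends an operand-size suffix ('l' = 32 bits)
    # and lists the source operand first.
    return f"{mnemonic}l {operand(src)}, {operand(dst)}"

print(intel_to_att("mov eax, 5"))     # movl $5, %eax
print(intel_to_att("add eax, ebx"))   # addl %ebx, %eax
```

Both notations describe the same machine instruction; only the way it is written down differs.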
The mnemonics used are usually the same for all processors of the same architecture or family of architectures (among the widely known ones are the mnemonics of Motorola, ARM, x86 processors and controllers). They are described in the processor specification. Possible exceptions:
If the assembler uses cross-platform AT&T syntax (original mnemonics are coerced to AT&T syntax)
If initially there were two standards for recording mnemonics (the command system was inherited from a processor of another manufacturer).
For example, the Zilog Z80 processor inherited the Intel i8080 instruction set, extended it and changed the mnemonics (and register designations) in its own way. For instance, it changed Intel's mov to ld. The Motorola Fireball processors inherited the Z80 instruction set, slightly cutting it down. At the same time, Motorola officially returned to Intel mnemonics. As a result, at the moment half of the Fireball assemblers work with Intel mnemonics and half with Zilog mnemonics.

Directives

In addition to instructions, a program can contain directives: commands that are not translated directly into machine instructions but control how the assembler works. Their set and syntax vary considerably and depend not on the hardware platform but on the assembler used (giving rise to dialects within a single family of architectures). A typical "gentleman's set" of directives includes:
data definition (constants and variables)
managing the organization of the program in memory and the parameters of the output file
setting the compiler mode
all kinds of abstractions (i.e. elements of high-level languages) - from the design of procedures and functions (to simplify the implementation of the procedural programming paradigm) to conditional constructs and loops (for the structured programming paradigm)
macros

Sample program

An example Hello world program for MS-DOS for x86 architecture in the TASM dialect:

.MODEL TINY
CODE SEGMENT
    ASSUME CS:CODE, DS:CODE
    ORG 100h
START:
    mov ah, 9
    mov dx, OFFSET Msg
    int 21h
    int 20h
    Msg DB "Hello World", 13, 10, "$"
CODE ENDS
END START

Origin and criticism of the term "assembly language"

This type of language got its name from the translator (compiler) for these languages, the assembler. The translator's name comes from the fact that the first computers had no higher-level languages, and the only alternative to writing programs with an assembler was programming directly in machine codes.
In Russian, assembly language is often called simply "assembler" (and things related to it are described with the adjective derived from "assembler"), which is wrong judging by the English meaning of the word but fits the rules of the Russian language. However, the assembler (the program) itself is also called simply "assembler" rather than "assembly language compiler" or the like.
The use of the term "assembly language" can also lead to the misconception that there is a single low-level language, or at least a standard for such languages. When naming the language in which a specific program is written, it is advisable to specify for what architecture it is intended and in what dialect of the language it is written.