Linux Kernel speculation

- Nobby Anderson

Thu, May 21, 2009 5:36 PM

We have a number of systems based on the ppc core in a Virtex 4 FX12. The system has 4MB flash RAM for program storage and 32MB RAM and uses one of the EDK-supplied Ethernet cores for comms (temac).

Some time ago I ported a 2.6.15 kernel onto it and we've been running with that ever since. The system boots with a small bootloader which loads uboot which in turn loads the Linux kernel. The kernel has an initrd and boots into a ram filesystem that's loaded off a compressed image on the flash.

The total boot time is of the order of 35 seconds - that's from a cold start to the application running with tcp sockets open to receive stuff from the LAN. I'd like to explore ways of making that faster.

It seems that the biggest pauses in the boot process are to do with the initrd loading, then the flash image being decompressed into RAM for the ramdisk, and then there's a long pause while the Ethernet hardware initialises. There's also about 5 seconds lost at the start while the system looks for keypresses to activate the two boot loaders' flash-ram loading tools, and I'm going to replace both of those to look for a jumper setting on the board.

What's the most effective route to further reducing the boot time? For example, is there anything I can do to get rid of the initrd, or to speed up loading a flash disk into RAM? Is there any benefit to going for a later kernel? Later desktop kernels seem to boot a lot faster than older ones on the same hardware, but I'm unsure how much of that is to do with more efficient driver and other module loading and how much is to do with the basic kernel itself. I've already removed everything from the kernel that's not needed and everything that it needs is compiled in, I believe.

I'm really looking for ideas for where to look next! Well, that and to get some feeling for whether I'm going to be able to get a worthwhile speed return. Like all these things, there's a tradeoff between effort and worth.

Thanks in advance, Nobby

- Frank Buss

Thu, May 21, 2009 6:15 PM

35 seconds is long. You can reduce it to below 2 seconds:

formatting link

Looks like they use a proprietary boot loader, but maybe their modified busybox helps you a bit:

formatting link

There are lots of other projects to be found when searching with Google for faster Linux booting. Some time ago I read about an init.d replacement named Upstart:

formatting link

I don't know if it is good, but the description looks good.

If you don't need a full init system, you can write your own shell script and let it execute as the init process. It starts only the required daemons, initializes your network etc., maybe all in the background, and then starts your applications. I've done this for a product; it is much simpler than a full init.d system and suits small embedded systems that don't need different runlevels.

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

- AZ Nomad

Thu, May 21, 2009 6:24 PM

Do you have unnecessary devices compiled in your kernel or as modules in your initrd?

I have a diskless multimedia computer that boots off a LAN and it takes

- Nobby Anderson

Thu, May 21, 2009 9:41 PM

Thanks for the info, I'd love a 2 second boot time!

None of the existing time is taken in init; the init script here just configures the Ethernet and starts the application. It's all in the kernel/ramdisk loading. I'm not sure how I could speed up the loading process. I suspect the ARM stuff in the links above doesn't use compressed disk and kernel images for a start, which would help, but I only have 4MB flash so I need to do that.

Nobby

- Nobby Anderson

Thu, May 21, 2009 9:43 PM

Yes, everything is compiled in. The two main pauses are loading the initrd and then uncompressing the system ram disk. Sadly, seeing as I only have 4MB flash, I have to compress everything including the kernel image.

Nobby

- Frank Buss

Thu, May 21, 2009 10:27 PM

Depends on your CPU, but loading the ramdisk and decompressing it should be fast, only a few seconds. Maybe you can put some serial port debug outputs with timestamps in the kernel, to see where most of the time is wasted?
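As a rough illustration of the timestamp idea: if the console log lines carry a `[seconds.microseconds]` prefix (the format the kernel's printk timestamps use when `CONFIG_PRINTK_TIME` is enabled, assuming that option is available in the kernel in question), a small host-side script can rank the pauses between consecutive messages. The sample log below is invented for illustration and is not taken from the board discussed here; `largest_gaps` is a hypothetical helper name.

```python
import re

# Matches the "[    1.234567] message" prefix produced by printk timestamps.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")

def largest_gaps(lines, top=3):
    """Return the `top` largest pauses as (gap_seconds, message_before_pause)."""
    stamped = []
    for line in lines:
        m = STAMP.match(line)
        if m:  # lines without a timestamp are ignored
            stamped.append((float(m.group(1)), m.group(2)))
    gaps = []
    for (t0, msg0), (t1, _msg1) in zip(stamped, stamped[1:]):
        gaps.append((round(t1 - t0, 6), msg0))
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]

# Invented sample log; real messages and timings will differ.
SAMPLE_LOG = [
    "[    0.000000] Linux version 2.6.15",
    "[    0.410000] RAMDISK: Compressed image found at block 0",
    "[    6.300000] VFS: Mounted root (ext2 filesystem).",
    "[    6.450000] eth0: initialising",
    "[   12.500000] eth0: link up",
]

if __name__ == "__main__":
    for gap, msg in largest_gaps(SAMPLE_LOG):
        print(f"{gap:8.3f} s after: {msg}")
```

The message printed just before a long gap is usually the place to start looking, since the pause happens in whatever the kernel does next.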

--
Frank Buss, fb@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de

- Jacko

Thu, May 21, 2009 11:16 PM

Maybe use threaded code as a form of compression. Work out how long a subroutine call and return takes, and then search recursively for any code sequence used more than n times whose length makes replacing it with a call a net saving, taking the longest common sequences first. Any flashy register renaming in the compiled code will make for less compression of the code this way, and greater cache usage if any cache is present.

cheers jacko

- Nobody

Fri, May 22, 2009 1:36 AM

Then you shouldn't need an initrd. The rationale behind initrd is so that you can load modules which are needed for mounting the root filesystem (e.g. IDE/SCSI/RAID/USB drivers, filesystem modules).

For an embedded system, you can just build a kernel with all of the relevant drivers built in.

Also, filesystems such as cramfs and squashfs decompress data on-demand, rather than decompressing the entire image into RAM. This reduces the start-up time and memory usage, but results in slower access.
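To make the on-demand idea concrete, here is a toy sketch of the principle only: the image is compressed block by block, and a block is inflated the first time a read touches it. It is not how cramfs or squashfs are implemented (they work inside the kernel with its own caching); the names `build_image` and `LazyImage` and the 4096-byte block size are assumptions for illustration.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size for this sketch

def build_image(data, block_size=BLOCK_SIZE):
    """Compress `data` block by block so each block can be read independently."""
    return [zlib.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

class LazyImage:
    """Decompresses a block only when a read first touches it."""

    def __init__(self, blocks, block_size=BLOCK_SIZE):
        self.blocks = blocks
        self.block_size = block_size
        self.cache = {}          # block index -> decompressed bytes
        self.decompressions = 0  # how many blocks were actually inflated

    def _block(self, index):
        if index not in self.cache:
            self.cache[index] = zlib.decompress(self.blocks[index])
            self.decompressions += 1
        return self.cache[index]

    def read(self, offset, length):
        out = bytearray()
        end = offset + length
        while offset < end:
            index, start = divmod(offset, self.block_size)
            if index >= len(self.blocks):
                break  # read runs past the end of the image
            chunk = self._block(index)[start:start + (end - offset)]
            if not chunk:
                break
            out += chunk
            offset += len(chunk)
        return bytes(out)
```

With a four-block image, reading ten bytes that straddle the first block boundary inflates only two blocks; the rest of the image is never decompressed unless something reads it. That is the start-up saving, paid for by decompression work on later first-time reads.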

- Nobby Anderson

Fri, May 22, 2009 8:02 AM

Sorry, I had confused myself. There isn't an initrd; that was in earlier versions, and it had not registered that it was no longer there because I still get a kernel boot message "Freeing initrd memory". However, there is no initrd, which is what I expect seeing as everything is compiled in.

OK, thanks, they're worth looking at. I re-timed the boot, and it's just under 30 seconds. Six are in the initial countdowns for the low level bootloader and uboot, both of which I can get rid of, about six are taken uncompressing the ramdisk and six go on initialising the temac ethernet adapter. Uboot also checks the compressed ram image, which takes a couple of seconds, and I could dump that: the system will either work or not, and if the ramdisk image is corrupt there is no benefit in knowing early on, as there's not a lot I can do about it. If I can get rid of the ramdisk decompression time that would be worthwhile, so I'll definitely look at the other filesystems you mentioned. Once the application is running it uses no disk at all.
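The figures above can be put into a small budget to see what each change would buy. This is only arithmetic on the numbers quoted in this post, with "just under 30 seconds" rounded to 30 and "a couple of seconds" taken as 2; the phase names and helper functions are made up for the sketch.

```python
TOTAL_BOOT_S = 30.0  # "just under 30 seconds", rounded up

PHASES_S = {
    "bootloader countdowns": 6.0,
    "u-boot image check": 2.0,      # "a couple of seconds"
    "ramdisk decompression": 6.0,
    "temac init": 6.0,
}

def remaining_after(removed, total=TOTAL_BOOT_S, phases=PHASES_S):
    """Boot time left if the named phases were eliminated entirely."""
    return total - sum(phases[name] for name in removed)

def unaccounted(total=TOTAL_BOOT_S, phases=PHASES_S):
    """Time not attributed to any of the measured phases."""
    return total - sum(phases.values())

if __name__ == "__main__":
    easy = ["bootloader countdowns", "u-boot image check"]
    print("after dropping countdowns and image check:", remaining_after(easy))
    print("and with no ramdisk decompression:",
          remaining_after(easy + ["ramdisk decompression"]))
    print("not yet attributed to any phase:", unaccounted())
```

On these rounded numbers, the two easy removals leave about 22 seconds, eliminating ramdisk decompression as well leaves about 16, and roughly 10 seconds are not yet attributed to anything, which is where timestamped logging would help.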

Thanks, Nobby

- dawydiuk

Thu, Jun 11, 2009 12:19 AM

Get rid of U-Boot. Have your bootrom load your kernel into RAM and jump right into it. Then disable the interface for modifying the kernel arguments. Finally, boot to an initial ramdisk and drop the user to the shell while you do other things in the background. This should get you to under a 5 second boot time.

Regards, Eddie

- cs_posting

Thu, Jun 11, 2009 7:09 PM

On a full-blown desktop system, no, you shouldn't.

But embedded systems may use basically the same mechanisms for their ultimate run-time environment.

Or maybe it's better to say that the whole initrd scheme is basically about booting your desktop up in an embedded-system type of miniature configuration, that then bootstraps the full system.

- cs_posting

Thu, Jun 11, 2009 7:19 PM

I have a blackfin uClinux system that I just clocked at 12 seconds from A/C power to user-mode application starting up, including 2 second u-boot delay. I'm not noticing any delay during kernel or ramfs decompression, but the size of the uImage as compared to the filesystem confirms that it's compressed.

I wonder if u-boot might be running the processor in some severely suboptimal (but safe) way?

You may be able to try a newer u-boot without changing kernel versions and thus not really having to make changes to your application itself.

Also make sure the ethernet pause is just the hardware drivers, not something like a vestigial attempt to contact a DHCP server or something like that.

- cs_posting

Thu, Jun 11, 2009 7:25 PM

Another idea... could you have a severely bottlenecked interface to flash, either due to hardware limitations or more likely "safe" settings used during this early phase of operating? I'm thinking something like very conservative bus timing, using byte-wide access, an SPI device run at a pathetically low clock rate or one byte at a time, or some really running-around-in-circles block translation mapping... etc.
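A back-of-the-envelope check shows how much such settings can matter. The sketch below estimates the time to read a whole flash device from the bus width and the time per access; the 4MB size comes from this thread, but the access times are purely hypothetical and `flash_read_seconds` is an illustrative name.

```python
def flash_read_seconds(size_bytes, bus_width_bits, access_time_ns):
    """Time to read `size_bytes` when each access returns one bus-width word."""
    accesses = size_bytes / (bus_width_bits / 8)
    return accesses * access_time_ns * 1e-9

if __name__ == "__main__":
    flash = 4 * 1024 * 1024  # 4MB flash, as in the original post
    # Hypothetical "safe" setting: byte-wide access, 1000 ns per access.
    print("conservative:", flash_read_seconds(flash, 8, 1000), "s")
    # Hypothetical tuned setting: 32-bit access, 100 ns per access.
    print("tuned:", flash_read_seconds(flash, 32, 100), "s")
```

Under these assumed numbers the same 4MB takes about 4.2 seconds in the conservative case and about 0.1 seconds in the tuned case, so it is worth checking what the bootloader actually programs into the memory controller.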

- CBFalconer

Fri, Jun 12, 2009 12:34 AM

The use of compressed kernel code can easily speed up booting on a reasonably fast processor, because the sum of the compressed file read time and the decompression time is less than the read time of the uncompressed material.
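This trade-off can be written down as a simple model, assuming the read and the decompression happen one after the other rather than overlapping. With image size S, read throughput R, compression ratio c (compressed size divided by original size) and decompression throughput D measured in output bytes, the compressed load takes c*S/R + S/D against S/R uncompressed, so compression wins when D > R / (1 - c). The function names and the numbers in the example are hypothetical.

```python
def load_time_uncompressed(size_bytes, read_bps):
    """Seconds to read the raw image straight from storage."""
    return size_bytes / read_bps

def load_time_compressed(size_bytes, ratio, read_bps, decompress_bps):
    """Seconds to read the compressed image and then inflate it.

    ratio = compressed size / original size; decompress_bps counts output bytes.
    """
    return (size_bytes * ratio) / read_bps + size_bytes / decompress_bps

def break_even_decompress_bps(ratio, read_bps):
    """Decompression throughput above which the compressed image loads faster."""
    return read_bps / (1.0 - ratio)

if __name__ == "__main__":
    size, read = 8_000_000, 1_000_000  # hypothetical: 8 MB image, 1 MB/s flash
    print(load_time_uncompressed(size, read))
    print(load_time_compressed(size, 0.5, read, 4_000_000))
    print(break_even_decompress_bps(0.5, read))
```

On a slow processor or with very fast storage the inequality goes the other way, which may be why decompression shows up as a visible pause on some systems and not on others.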

--
 [mail]: Chuck F (cbfalconer at maineline dot net) 
 [page]: 
            Try the download section.

- Nobody

Sun, Jun 14, 2009 3:15 AM

It isn't about size, but portability.

initrd was created so that you can create a kernel and filesystem image which will boot on a wide range of hardware. Before initrd, Linux distributions needed to provide kernels which had dozens of different IDE/SCSI drivers built in (this is before SATA). You couldn't provide the drivers as modules because you need to be able to access the disk before you can load modules.

initrd lets you mount an initial RAM filesystem containing IDE, SCSI and SATA drivers, plus the tools required to probe the hardware and load the appropriate driver. Once you've loaded the driver, you can get everything else from the hard disk.

If you're building a kernel for specific hardware, you don't need to use modules for the core hardware, or maybe at all. You just build what you need directly into the kernel. No modules means no need for initrd.

- cs_posting

Mon, Jun 15, 2009 4:17 PM

I didn't actually say it is about size... but the argument you are going to make below actually does come down largely to size and ease of changing.

You still have to provide drivers in the image for all the hardware you want to boot on. The difference is that you can then reclaim the space of the modules that you don't use (for the hardware you don't have) when you free the initrd; with a monolithic kernel you can't free up the space of unused compiled-in drivers (or do we do kernel paging? I must admit I don't know).

Also it's arguably easier to issue the cpio incantation to make a new image than it is to recompile the kernel to include different drivers.

Not exactly, because initrd is not confined to making modules available. Initrd also lets you run scripts to set up your root filesystem: you might need to bring up a network and mount it over a network filesystem. Or set up unionfs... Or re-image the machine from some internal or external archive... etc. You have all the functionality you choose to include, and as I started out by saying, many embedded systems just stop here and run out of the initrd or a similar-idea ramfs or whatever permanently.

- Nobody

Fri, Jun 19, 2009 8:54 AM

No, kernel memory is physical memory.

Right. But still, it comes down to portability. If you're targeting a specific piece of hardware, you know which drivers you need, and you don't compile in anything else.

The kernel can mount an NFS root filesystem directly, provided that you can get the network up using only what's built into the kernel; if you need to log into a VPN, that would require user-space support.

Probably the most common non-module use of initrd on desktop and server systems is if the root filesystem is on a RAID array, as these typically can't be initialised without user-space support. But that's atypical for an embedded system.

- cs_posting

Fri, Jun 19, 2009 2:35 PM

You still have to "include" all of the drivers for the hardware you might need in order to be portable enough to boot from it; the question is whether you are compiling them into the kernel (where you can't ordinarily free the unused ones) or cpio'ing them into the initrd.

Most common? Hmm, I wonder how many linux eeepc's (unionfs setup script in initrd) have shipped compared to servers with RAID arrays ;-)

Note that I was not describing this as typical for an embedded system, instead I was pointing out that more complicated systems may initially boot to a temporary ram root filesystem using similar or identical mechanism to what embedded systems use as their ultimate root filesystem. In other words, many complicated systems behave like embedded systems in order to bootstrap their full complexity.