Friday, August 28, 2009

A Job Where I Use Linux

OK, it's been a while since my last update. But that's OK because nobody is reading this blog anyway. In the space between my last post and this post, I have managed to secure a job where developing software in Linux is something I'll actually be paid for.

So, this post is mostly to mark this event in my life. I will now start updating more often as I am now in pretty deep learning mode. I'll see if I can do a daily post on one of the aspects I'm learning.

Thursday, July 16, 2009

Probing Works--But Still Error 12

I've been off on a little side project learning SQL for something sort of interesting. So I haven't been working on my Linux stuff. But this morning is a good time to check it out. Last time, I had finally gotten the new driver to compile after getting the detection code connected and got it loaded on my USB stick. Sooo....

I rebooted my computer and from the GRUB menu, typed in 'bootp'.

There is some searching for items on the PCI bus, but here is the most important output:


bus 02, function 00, vendor 8086, device 109a
Found Intel EtherExpressPro1000 Marks Kluge at 0x3000, ROM address 0x0000
NIC_LIST start
...
NIC_LIST vendor = 8086 dev_id = 109a ioaddr=3000
NIC_LIST end
Probing...[Intel EtherExpressPro1000 Marks Kluge]e1000: hw_addr=00000000, rom_addr=00000000
Can't do this! Mark.e1000: mmio_start = 00067c3c, mmio_len = 04000000
PCI latency timer (CLFI) is unreasonably low at 0. Setting to 32 clocks.
vendor = 00008086, device = 0000109a, revision = 00000000, subsys = 0000107b
e1000_set_mac_type
e1000: Unknown MAC type

Error 12: Invalid device requested

So, progress. But still a lot of questions. Why is mmio_start not some nice round number? Why is the PCI latency timer 0? What is the PCI latency timer? Is 0 a valid value for hw_addr and rom_addr?
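
One way to sanity-check a value like mmio_start is to decode it the way a PCI base address register (BAR) is laid out. Here is a minimal sketch in C, assuming the value came straight from a BAR. The names bar_info and decode_bar are made up for illustration and are not from the GRUB or e1000 sources.

```c
#include <stdint.h>

/* Hypothetical helper type; not part of GRUB or the e1000 patch. */
struct bar_info {
    int is_io;      /* 1 = I/O port BAR, 0 = memory BAR */
    uint32_t base;  /* base address with the flag bits masked off */
};

/* Decode a raw 32-bit BAR value as read from PCI config space.
 * Bit 0 set means I/O space (address in bits 31..2);
 * bit 0 clear means memory space (address in bits 31..4). */
static struct bar_info decode_bar(uint32_t raw)
{
    struct bar_info info;
    info.is_io = (int)(raw & 0x1u);
    info.base = info.is_io ? (raw & ~0x3u) : (raw & ~0xFu);
    return info;
}
```

A properly masked memory BAR base is always at least 16-byte aligned, so a base reported as 0x00067c3c (low nibble 0xc) hints that the value did not come from a masked memory BAR at all.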

Hopefully, these are answers I can find. I've come pretty far. The light on the horizon is fading a bit, but it's not extinguished. Stay tuned...

Tuesday, July 7, 2009

Adding a Kluge

I added a couple of things to my drivers. The big one is this addition to config.c.

{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82546EB_FIBER,
"Intel EtherExpressPro1000 82545EB Fiber", 0, 0, 0, 0},
{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82546GB_COPPER,
"Intel EtherExpressPro1000 82546GB Copper", 0, 0, 0, 0},
/* Mark start */
{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_MARKS_KLUGE,
"Intel EtherExpressPro1000 Marks Kluge", 0, 0, 0, 0},
/* Mark end */
This makes it recognize my device by telling the pci.c to expect it. I saw the debug code print out that it found it, but I still got the Error 12: Invalid device.

So I looked and found another structure I hadn't updated yet in config.c:


static struct pci_dispatch_table PCI_NIC[] =
{
...
# ifdef INCLUDE_E1000
{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_MARKS_KLUGE, e1000_probe },
# endif /* INCLUDE_E1000 */
...


e1000_probe is the function that finds the device and fills in the driver struct with pointers to the correct driver functions (I hope they are correct enough).
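
The idea behind a dispatch table like PCI_NIC can be sketched in a few lines of C. This is only an illustration: dispatch_entry, fake_e1000_probe, and find_probe are hypothetical stand-ins, and the real GRUB structures and probe signature differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real GRUB structures. */
typedef int (*probe_fn)(uint16_t vendor, uint16_t device);

struct dispatch_entry {
    uint16_t vendor;
    uint16_t device;
    probe_fn probe;
};

/* Pretend probe routine; a real one would set up the NIC. */
static int fake_e1000_probe(uint16_t vendor, uint16_t device)
{
    (void)vendor;
    (void)device;
    return 1; /* pretend the card initialized */
}

static const struct dispatch_entry table[] = {
    { 0x8086, 0x109a, fake_e1000_probe },
};

/* Return the probe function for a vendor/device pair, or NULL. */
static probe_fn find_probe(uint16_t vendor, uint16_t device)
{
    size_t i;
    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (table[i].vendor == vendor && table[i].device == device)
            return table[i].probe;
    }
    return NULL;
}
```

If the vendor/device pair is missing from the table, the lookup returns NULL, which is roughly the situation that ends in "Invalid device requested".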

Baby steps. It's my first time working with drivers in Linux. It sure has its challenges.

So, I make. And as you might expect, I get build errors.

Seems the e1000.c uses pci_read_config_word whereas the rest of the files use pcibios_read_config_word. The same pattern applies to a few pci BIOS access routines. Those are easy enough with #defines for the time being. But there are 2 undefined functions--pci_bar_start and pci_bar_size. This will be a little tougher. There is code out there for VirtualBox. Is it the same thing?
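
For reference, the usual way to size a PCI BAR is to write all ones to it, read the value back, mask off the flag bits, invert, and add one. The sketch below does only that arithmetic on a readback value, with no hardware access. The name bar_size_from_readback is hypothetical, and whether the patch's pci_bar_size works this way is an assumption on my part.

```c
#include <stdint.h>

/* Given the value read back from a memory BAR after writing
 * 0xFFFFFFFF to it, compute the size of the region in bytes.
 * The low four bits are flags, not address bits. */
static uint32_t bar_size_from_readback(uint32_t readback)
{
    uint32_t mask = readback & ~0xFu;  /* drop the flag bits */
    if (mask == 0)
        return 0;                      /* BAR not implemented */
    return ~mask + 1u;                 /* lowest writable address bit = size */
}
```

A readback of 0xFFFE0000, for example, works out to a 0x20000-byte (128 KB) region.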

Well, I guess I'll post this now and see what this bar address stuff is all about. Stay tuned.

Monday, July 6, 2009

Finding The E1000 HW

I've been off doing something else for a little while, so I haven't been doing Linux stuff. But as of today, I'm back at it.

So, to recap, I can get my USB drive to boot properly and it loads stage 2. But it can't find my Intel ethernet card when I enter dhcp or bootp. So time to see what I can get it to tell me. The first step--see if I can turn on some debug information.

builtins.c is where the bootp and dhcp commands are found. They probe for the device with eth_probe(...) in config.c and e1000_probe(...) in e1000.c. Turning on DEBUG caused the build to fail. But it was a simple fix--just a printf printing a non-existent field.

So, I'm now in the process of tracking down the problem. I was led to pci.c where it seems to be failing to find the ethernet card. I can see it looking through the stuff out there on the PCI bus.

I see things that have the Intel vendor ID (0x8086) but none that have an e1000 device ID. So I headed off to check out /proc/bus/pci and found my e1000 NIC device. It has a device ID of 0x109a. Hmmm.... My driver has a list of devices but none go that high.

Well, it's 11:30 at night right now. I guess I'll see what I can do in the morning. But I'm off again. And making progress.

Wednesday, June 17, 2009

Solving Error 12: Invalid device requested

OK, I found a problem. INCLUDE_PCI wasn't defined because I didn't have INCLUDE_E1000 in the list of drivers that bring in INCLUDE_PCI. So all is not lost!
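
The shape of the fix is a preprocessor condition along these lines. This is a sketch, not the actual GRUB header: INCLUDE_EEPRO100 is just an example of another PCI driver macro, and pci_support_compiled_in is a hypothetical helper for checking the result.

```c
/* Driver selection, normally passed on the compiler command line. */
#define INCLUDE_E1000 1

/* Any PCI-based driver should pull in the PCI bus code.
 * INCLUDE_EEPRO100 is only an example of another PCI driver. */
#if defined(INCLUDE_EEPRO100) || defined(INCLUDE_E1000)
# define INCLUDE_PCI 1
#endif

/* Report at run time whether the PCI code was compiled in. */
static int pci_support_compiled_in(void)
{
#ifdef INCLUDE_PCI
    return 1;
#else
    return 0;
#endif
}
```

Leave the new driver out of that condition and everything guarded by INCLUDE_PCI silently disappears from the build.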

I fixed this issue and a little hiccup with device ID definitions not being found. But now I'm ready to reboot and see what happens...

[It is now Monday. I have been busy this weekend and won't be getting back to this until Tuesday, so I thought I'd just publish as is.]

Adding Intel E1000 Net Adapter Driver To Grub 0.97

The nice thing about getting into Linux later on is that a lot of the problems have been solved. Or, at least, other people have encountered the problems and gotten things into good enough shape that they can get by. And some of those efforts are googlable.

Last night as I was drifting off, I was looking for evidence that people want to use Grub with the Intel e1000 network driver that I have on my computer. And, yes, it is true--not only did people want it--one guy was desperate for it.

I found some files at a site and copied them into a directory in my patch directory. The driver files were simply added to my netboot directory where they all are. The rest of it was makefiles. I found it easier to just modify them according to the patterns--all drivers essentially have the same patterns in the makefile. The exception being that this driver requires one more command line define so I had to make sure it was in the right places. The config.c file had to get all of these strings that describe the driver in English for users to read.

Then I had to wrestle with the configure files. Grub lets you set up makefiles with ./configure with options. I was getting problems there until I did a make clean. Then, the makefiles were set up properly.

Next, I typed 'make' and held my breath.

It wouldn't be Linux if it compiled the first time, right? I needed to add definitions for uint32_t, uint16_t, uint64_t, uint8_t and int32_t. Easy enough, although I had to check /usr/include/limits.h to confirm that a uint64_t would be an unsigned long long.
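
A sketch of what such definitions can look like, with compile-time checks on the widths. The e1k_ prefix is hypothetical and is only there so the example doesn't clash with <stdint.h>; the real definitions used the plain uint32_t-style names. The type choices assume a typical 32-bit or 64-bit Linux gcc target.

```c
#include <limits.h>

/* Hypothetical names, prefixed to avoid clashing with <stdint.h>. */
typedef unsigned char      e1k_u8;
typedef unsigned short     e1k_u16;
typedef unsigned int       e1k_u32;
typedef int                e1k_s32;
typedef unsigned long long e1k_u64;

/* Compile-time checks: an array of size -1 is a compile error. */
typedef char e1k_check_u8 [(sizeof(e1k_u8)  * CHAR_BIT == 8)  ? 1 : -1];
typedef char e1k_check_u16[(sizeof(e1k_u16) * CHAR_BIT == 16) ? 1 : -1];
typedef char e1k_check_u32[(sizeof(e1k_u32) * CHAR_BIT == 32) ? 1 : -1];
typedef char e1k_check_s32[(sizeof(e1k_s32) * CHAR_BIT == 32) ? 1 : -1];
typedef char e1k_check_u64[(sizeof(e1k_u64) * CHAR_BIT == 64) ? 1 : -1];

/* Small helper so the widest type can also be checked at run time. */
static int e1k_u64_bits(void)
{
    return (int)(sizeof(e1k_u64) * CHAR_BIT);
}
```

If any of the guesses about type widths is wrong for the target, the build fails right at the typedef instead of misbehaving later.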

There were a couple of PCI constants that needed copying from the patch. And I needed to get rid of an __unused in an argument list. After that it compiled!

Woohoo! But will it run?

It runs! I typed 'bootp' and...
Error 12: Invalid device requested
Sigh. Well, this is going to take some effort to solve. So, my solution (or white flag) will be in a future post...

Tuesday, June 16, 2009

Grub Success--Now For The Next Challenge

I succeeded. Grub loads off my USB stick and goes into interactive mode. I spent an hour spinning my wheels in Grub because I forgot the 'sudo' and it wasn't able to see the drives. But, it sent me to gdb and I learned a little more about Grub. And, once I had that solved, it all came together pretty quickly.

In interactive mode, there is no (hd2). But there is an (fd0). This is because the USB drive is 0x00 and since bit 7 isn't set, it is regarded as a floppy disk. I need to figure out how to either remap 0x00 to 0x82 or else force stage2 to address fd0 in LBA mode.
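
The bit 7 rule is simple enough to write down in C. This is a sketch of the naming convention as described here, with hypothetical helper names; it is not code from Grub.

```c
#include <stddef.h>

/* Bit 7 of a BIOS drive number distinguishes hard disks from floppies. */
static int bios_drive_is_hd(unsigned char drive)
{
    return (drive & 0x80) != 0;
}

/* Index within its class: 0x00 -> 0, 0x80 -> 0, 0x82 -> 2. */
static unsigned bios_drive_index(unsigned char drive)
{
    return drive & 0x7fu;
}

/* Write "fdN" or "hdN" for single-digit N; returns 0 on success,
 * -1 if the index needs more than one digit or the buffer is too small. */
static int bios_drive_name(unsigned char drive, char *out, size_t len)
{
    unsigned idx = bios_drive_index(drive);
    if (len < 4 || idx > 9)
        return -1;
    out[0] = bios_drive_is_hd(drive) ? 'h' : 'f';
    out[1] = 'd';
    out[2] = (char)('0' + idx);
    out[3] = '\0';
    return 0;
}
```

With drive 0x00 this yields "fd0", and with 0x82 it yields "hd2", which matches what interactive mode shows.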

I also tried the 'bootp' command and it said it couldn't detect the hardware. I had compiled in the driver for an older intel 10/100 connection whereas my computer has hardware for a 10/100/1000 circuit. I would think there would be some backward compatibility. But if not, I'll see what I can do.

But those are tasks for tomorrow.

Monday, June 15, 2009

Why Stage 1 Loads Only One Sector of Stage 1.5

I thought Grub was supposed to load a 1 sector stage 1 that loads a multiple sector stage 1.5. But, that's not how it works. It actually loads 1 sector of stage 1.5 (or stage 2.0) and then that sector loads up the rest of its sectors.

Now that mystery is solved. And start.S in the stage2 directory has the courtesy of saying that one of its data sections will be filled in by the 'install' command. Though it does do this:
/* restore %ax */
popw %ax
Hmm.... Now what was %ax, again? It would be nice to know. What's the point of a comment like that? I spent a minute tracking down the push to figure out what %ax was when it was pushed--the number of sectors loaded. Sheesh!

Oh, and I learned the point of the .mode byte in stage1.S. It is to tell the next stage if it can use LBA mode or has to use CHS mode.

So--my path from here is clear. I need to make sure the part of stage1 that gets patched by the grub utility with 'nop; nop;' doesn't hold code. I'll replace that whole chunk of code with 'nop; nop;' so the patch is rendered useless. The second thing I'll do is make sure it lets me into LBA mode with bit 7 unset. That's a little 2 line snippet of code that checks the bit and jumps to the CHS section if it is clear. That will be excised. I'll load it up and see what happens. And in order to really confirm that I'm getting to my own stage 1_5 start sector, I'll modify it to spew the notification string to the screen in an endless loop. That'll be the acid test.

Trials, Tribulations, Successes, and STAGE1_BOOT_DRIVE_CHECK

I made a couple more images that go into my MBR to give me information. It took a while to learn the importance of not forgetting to place the '$' before a macro that evaluates a value. Without the '$' it dereferences memory and uses that dereferenced value instead of the value itself. And I solved other equally frustrating problems with the help of 'ndisasm' to see what I actually got compared to what I think I wrote. The results? Nothing I didn't expect. No reason emerged why my boots should fail.

So, I went back to stage1.S and decided to trace it through from beginning to end assuming my device can use LBA's.

A couple of mysteries. First,
lba_mode:
/* save the total number of sectors */
movl 0x10(%si), %ecx
%si is not set up, yet. This should be harmless (%si is at the end of the notification string), but it's seemingly useless.

The second mystery is why the code loads only one sector of the 15 following the MBR. I have to find out why.

A final thing I found--I was removing some code that deals with the upper bit of the current drive which it thinks ought to be set. The execution contains:
boot_drive_check:
jmp 1f
testb $0x80, %dl
jnz 1f
movb $0x80, %dl
1:
which appears to do nothing. But when you install from Grub, it uses:
/* The offset of BOOT_DRIVE_CHECK. */
#define STAGE1_BOOT_DRIVE_CHECK 0x4b
to wipe that first jmp 1f with nop; nop. This was wiping other stuff and that is surely one reason my boot loader wasn't working. It would have been nice to have a little comment there in stage1.S explaining that Grub might write two nops to the address STAGE1_BOOT_DRIVE_CHECK. Sigh.
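
To make the hazard concrete, here is a sketch in C of that kind of patch with a guard added. The 0x4b offset is the one quoted above; the check for a short-jump opcode (0xEB) before patching is my own addition for illustration and is not something Grub does, as far as I can tell.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512
#define STAGE1_BOOT_DRIVE_CHECK 0x4b  /* offset quoted from Grub */
#define X86_NOP 0x90
#define X86_JMP_SHORT 0xEB

/* Overwrite the two-byte "jmp 1f" at the boot-drive-check offset with
 * two nops. Returns 0 on success, -1 if the buffer is too small, and
 * -2 if the bytes there are not a short jump (hypothetical safety check). */
static int patch_boot_drive_check(uint8_t *sector, size_t len)
{
    if (len < SECTOR_SIZE)
        return -1;
    if (sector[STAGE1_BOOT_DRIVE_CHECK] != X86_JMP_SHORT)
        return -2;
    sector[STAGE1_BOOT_DRIVE_CHECK] = X86_NOP;
    sector[STAGE1_BOOT_DRIVE_CHECK + 1] = X86_NOP;
    return 0;
}
```

Without a guard like that, moving code around in stage1.S means the two nops land on whatever instruction happens to sit at 0x4b.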

Anyway, I worked on this for 8 hours straight. I'm learning Grub inside and out. But I can't wait to be done with Grub so I can move onto something more interesting and less frustrating than BIOS interrupt calls and boot loading. But, I have some nice leads to track down. I'll avoid the STAGE1_BOOT_DRIVE_CHECK problem. And I'll gdb grub and watch what happens when it runs 'install_func' which loads stage1 and 'embed_func' which loads stage 1.5.

Sunday, June 14, 2009

Bootloading From the Ground Up

So, I'm still groping in the dark. I might as well embrace the darkness and start to learn how to grope. Err....as it were. Anyway, my goal is to make my own boot sectors that give me information about the disks known to the BIOS. It took a little work, and I had to get over a few obstacles, but I now have a sort of boot sector development platform.

The first obstacle--somehow, my BIOS lost the fact that it's supposed to boot from the USB before the main HDD. That was easy enough to fix. Now for explanations on the other, harder problems. Note that some of the solutions I came up with may not be necessary since this issue is the cause of a lot of failed USB boots.

I want a boot sector image that I can 'dd' to the USB key. The Grub method requires that I copy the image to a specific directory in the USB, unmount the drive, run grub, set the root, and type out the command to write it. Too much effort. I stripped out most of stage1.S and added my own code to just write @xx@ where xx is the drive name. I placed it on the USB, but it didn't boot.

To make it boot, I added in .bytes corresponding to stuff the boot sector seems to need--partition table, magic numbers, BIOS sector. I did that. Of course, I still hadn't confirmed the USB drive was in the proper boot order, so it may not have been necessary.
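
For the record, the fixed parts of a classic MBR boot sector are the partition table at offset 446 (four 16-byte entries) and the 0x55 0xAA signature in the last two bytes. A minimal sketch in C of building and checking such a buffer; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE       512
#define PART_TABLE_OFFSET 446   /* 0x1BE: four 16-byte entries */
#define PART_ENTRY_SIZE   16
#define SIGNATURE_OFFSET  510   /* 0x55, 0xAA */

/* Zero a sector buffer and stamp the boot signature on it. */
static void init_boot_sector(uint8_t sector[SECTOR_SIZE])
{
    memset(sector, 0, SECTOR_SIZE);
    sector[SIGNATURE_OFFSET]     = 0x55;
    sector[SIGNATURE_OFFSET + 1] = 0xAA;
}

/* Return 1 if the sector ends with the boot signature. */
static int has_boot_signature(const uint8_t sector[SECTOR_SIZE])
{
    return sector[SIGNATURE_OFFSET] == 0x55 &&
           sector[SIGNATURE_OFFSET + 1] == 0xAA;
}

/* Pointer to partition entry n (0..3) inside the sector. */
static uint8_t *partition_entry(uint8_t sector[SECTOR_SIZE], int n)
{
    return sector + PART_TABLE_OFFSET + PART_ENTRY_SIZE * n;
}
```

That leaves 446 bytes at the front for the boot code itself.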

But I finally did get it to run. Here is the code that shows me the drive number.

repeat:
movw $0x0001, %bx /* set page and FG color */
movb $0x0e, %ah /* int 0x10 Teletype output */

movb $0x40, %al /* '@' as a marker character */
int $0x10 /* print the '@' */

movb %dl, %al /* put drive in %al */
shrb $4, %al /* shift upper nibble to lower */
cmpb $10, %al /* compare to 10 */
js upperIsNum /* if %al is 9 or less, skip next command */
addb $7, %al /* %al is over nine--adjust to make it 'A'-'F' */
upperIsNum:
addb $0x30, %al /* adjust to make 0-9 '0'-'9' and 17-22 into 'A'-'F' */
movb $0x0e, %ah /* int 0x10 Teletype output */
int $0x10 /* print */

movb %dl, %al /* put drive in %al */
andb $0x0f, %al /* mask off upper nibble */
cmpb $10, %al /* compare to 10 */
js lowerIsNum /* if %al is 9 or less, skip next command */
addb $7, %al /* %al is over nine--adjust to make it 'A'-'F' */
lowerIsNum:
addb $0x30, %al /* adjust to make 0-9 '0'-'9' and 17-22 into 'A'-'F' */
movb $0x0e, %ah /* int 0x10 Teletype output */
int $0x10 /* print */

movb $0x40, %al /* '@' as a marker character */
movb $0x0e, %ah /* int 0x10 Teletype output */
int $0x10 /* print */
movb $0x0d, %al /* \r */
movb $0x0e, %ah /* int 0x10 Teletype output */
int $0x10 /* print */
movb $0x0a, %al /* \n */
movb $0x0e, %ah /* int 0x10 Teletype output */
int $0x10 /* print */

hlt /* halt */

jmp repeat /* repeat */

Allow me to make some observations. First, 'hlt' doesn't halt. Or perhaps it does but then causes a reset. Second, the cmpb operands are out of order because this is AT&T style assembly and not 'nasm' assembly and the operands are reversed. At least, I think, because the js should jump if the result is negative. Third, the int 0x10's are described in http://en.wikipedia.org/wiki/INT_10.
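
The nibble-to-character logic is easier to follow in C. This sketch mirrors the same "+7 then +0x30" steps; the function names are made up, and the comparison is against 10 so that values 0 through 9 are left alone.

```c
/* Convert a 4-bit value to its uppercase hex character,
 * using the same "+7 then +0x30" steps as the assembly. */
static char nibble_to_hex(unsigned char nibble)
{
    unsigned char c = nibble & 0x0f;
    if (c >= 10)
        c += 7;          /* skip the characters between '9' and 'A' */
    return (char)(c + 0x30);
}

/* Write the two hex digits of a byte plus a terminating NUL. */
static void byte_to_hex(unsigned char value, char out[3])
{
    out[0] = nibble_to_hex((unsigned char)(value >> 4));
    out[1] = nibble_to_hex((unsigned char)(value & 0x0f));
    out[2] = '\0';
}
```

The 7 is the gap between '9' (0x39) and 'A' (0x41), minus the one step the digit itself accounts for.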

This was the hard part--getting this to work. Can I work essentially blind? I knew that once I got this to work properly, the next steps would be progressively easier.

The next job--use int 0x13 to read the drive parameters for all 256 drives. If the carry flag is clear upon return from int 0x13, then print out the drive number and the number of heads. I did that and got the three I expected.

Then I wrote one that takes the drives that exist and tests for the CHS values. I'll provide the code for the section I wrote:

call PrintDriveParams

repeat:
call Home
jmp repeat

/* print a marker '.' */
movb $0x2e, %al
call PrintAlChar

/* print the drive */
movb %dl, %al
call PrintAlHex

/* print a marker '.' */
movb $0x2e, %al
call PrintAlChar

hlt

PrintAlChar:
pusha /* save all registers */

movw $0x0001, %bx /* set page and FG color */
movb $0x0e, %ah /* int 0x10 Teletype output */

int $0x10 /* print */

popa /* pop all registers */
ret /* return */

PrintAlHex:
pusha /* save all registers */
movb %al, %dl /* save %al */
movw $0x0001, %bx /* set page and FG color */
movb $0x0e, %ah /* int 0x10 Teletype output */

movb %dl, %al /* put drive in %al */
shrb $4, %al /* shift upper nibble to lower */
cmpb $10, %al /* compare to 10 */
js upperIsNum /* if %al is 9 or less, skip next command */
addb $7, %al /* %al is over nine--adjust to make it 'a-f' */
upperIsNum:
addb $0x30, %al /* adjust to make 0-9 '0'-'9' and 17-22 into 'A'-'F' */
int $0x10 /* print */
movb %dl, %al /* put drive in %al */
andb $0x0f, %al /* mask off upper nibble */
cmpb $10, %al /* compare to 10 */
js lowerIsNum /* if %al is 9 or less, skip next command */
addb $7, %al /* %al is over nine--adjust to make it 'a-f' */
lowerIsNum:
addb $0x30, %al /* adjust to make 0-9 '0'-'9' and 17-22 into 'A'-'F' */
int $0x10 /* print */
popa /* pop all registers */
ret /* return */

NewLine:
pusha /* save all registers */
movw $0x0001, %bx /* set page and FG color */
movb $0x0e, %ah /* int 0x10 Teletype output */
movb $0x0d, %al /* \r */
int $0x10 /* print */
movb $0x0a, %al /* \n */
int $0x10 /* print */
popa /* pop all registers */
ret /* return */

PrintDriveParams:
pusha /* save all registers */
movb $0x00, %cl /* cl = 0 (drive = 0) */
movb $0x3c, %al /* print '<' marker */
call PrintAlChar
printHeadsLoop:
movb $0x08, %ah /* int 0x13 Read Drive Parameters */
movb %cl, %dl /* dl = cl (dl = drive) */
pushw %cx
int $0x13
movw %cx, %bx
pop %cx
jc notValidHead
/* print drive number */
movb %cl, %al
call PrintAlHex
movb $0x3a, %al /* print ':' */
call PrintAlChar
movb %dh, %al /* heads */
call PrintAlHex
movb $0x20, %al /* print space */
call PrintAlChar
movb %bh, %al /* cylinders and sectors */
call PrintAlHex
movb %bl, %al
call PrintAlHex
call NewLine
notValidHead:
incb %cl /* inc cl (drive++) */
cmpb $0x00, %cl
jnz printHeadsLoop /* next */
printHeadsDone:
movb $0x3e, %al /* print '>' marker */
call PrintAlChar

popa /* pop all registers */
ret /* return */

Home:
pusha /* save all registers */
movw $0, %bx
movw $0, %dx
movb $2, %ah
int $0x10
popa /* pop all registers */
ret /* return */


The result was this:
<00: 08 F6F8
80: FE FEFF
81: FE 00FF
>

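The drive geometry of my flash drive is 255 heads, 31 cylinders, and 63 sectors per track, so the maximum head is 254 (0xfe), the maximum cylinder is 30 (0x1e), and the maximum sector is 63 (0x3f). int 0x13 function 0x08 returns the low eight cylinder bits in %ch and packs the top two cylinder bits above the six sector bits in %cl, so the CHS should look like FE 1E3F.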
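
To read the printed values, it helps to unpack them the way int 0x13 function 0x08 packs them: %dh is the maximum head, %ch holds the low eight bits of the maximum cylinder, and %cl holds the top two cylinder bits (bits 7..6) plus the maximum sector (bits 5..0). A sketch in C, with hypothetical names:

```c
#include <stdint.h>

struct chs_limits {
    unsigned max_cylinder;  /* 10 bits */
    unsigned max_head;      /* 8 bits, from %dh */
    unsigned max_sector;    /* 6 bits, 1-based */
};

/* Unpack the int 0x13 AH=0x08 results: %ch holds cylinder bits 7..0,
 * %cl bits 7..6 hold cylinder bits 9..8, %cl bits 5..0 hold the sector. */
static struct chs_limits decode_drive_params(uint8_t dh, uint8_t ch, uint8_t cl)
{
    struct chs_limits g;
    g.max_cylinder = (unsigned)ch | (((unsigned)cl & 0xC0u) << 2);
    g.max_head = dh;
    g.max_sector = cl & 0x3Fu;
    return g;
}
```

Read that way, the "80: FE FEFF" line is a maximum cylinder of 1022, a maximum head of 254, and 63 sectors, which looks like the saturated geometry a BIOS reports for a disk too big for CHS.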

So I don't trust any of those numbers. But CHS addressing is inferior to LBA addressing anyway. So, I need to see which drives are able to be addressed via LBA. I suspect all of them. Time to start using int 0x13 with %ah = 0x42. But that'll have to wait until tomorrow morning.

Saturday, June 13, 2009

An Ubuntu Annoyance

The makers of Ubuntu--or probably the Gnome desktop--decided that it would be cool if you could make multiple desktops that you can move between. They are correct. They also decided it would be cool if you could scroll between these with the mouse wheel. They are not correct. It's annoying as hell. Especially as any region of a window that doesn't accept scroll wheel messages passes them onto the desktop rather than quashes them.

The result is that I'm always scrolling my desktop when I mean to scroll my current window. I can't find any obvious way to disable this horrible behavior.

Friday, June 12, 2009

Friday Night Grub

OK, it's Friday evening and here I sit in front of my computer. I spent the afternoon reading Wrox's Professional Linux® Kernel Architecture weighing in at 1337 pages (was that 'leet' intentional?). So I needed a break from thinking of technical stuff--though during that break I read up on Red Black trees which are binary trees that have some simple rules which when violated are easy to detect and easy to fix and the act of fixing them keeps the Red Black tree relatively balanced.

Anyway, I am still working on my Grub problem. I want Grub to load in a stage 1.5 from my USB key and run properly. Last time, I got it to stop loading in the wrong 1.5, but it still doesn't want to load the 1.5 that does what I want it to do. The cursor just sits there and blinks at me. And the only way to debug and troubleshoot is to compile code, load it on the USB drive and reboot. So while everyone is out having fun, I'm here at home. In my defense, I do have some Granville Island Maple Cream Ale in the fridge.

So, I wanted to confirm exactly which drive the USB stick is. So, I added the following code to stage1.S right before the notification string gets written. Note that %dl contains the booting drive:

/* update notification with drive info */
movl $ABS(notification_string), %ecx
movb %dl, %al
shrb $4, %al
andb $0x0f, %al
orb $0x30, %al
movb %al, 0(%ecx)
movb %dl, %al
andb $0x0f, %al
orb $0x30, %al
movb %al, 1(%ecx)
What this does is to modify the notification string to create a sort-of hex representation of %dl. It won't be hex for nibbles greater than 9, but I can understand 0123456789:;<=>? instead of 0123456789abcdef.
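
The same trick in C, for anyone who wants to see the character mapping without rebooting. The function name is made up for illustration.

```c
/* OR each nibble with 0x30, as the patched stage1.S does.
 * Nibbles 0..9 give '0'..'9'; 10..15 give ':' ';' '<' '=' '>' '?'. */
static void byte_to_pseudo_hex(unsigned char value, char out[3])
{
    out[0] = (char)(((value >> 4) & 0x0f) | 0x30);
    out[1] = (char)((value & 0x0f) | 0x30);
    out[2] = '\0';
}
```

A drive number of 0x80 still reads as "80", while something like 0xAF would come out as ":?".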

I ran it and a few other related tests. The screen gets messed up but I had to remove some stuff to fit all this in. But I'm seeing '00'. So %dl is 0. I'm not sure if that's good or bad--but I think bad. The harddisk is supposed to be 0x80. The upper bit is a flag that indicates a harddisk. So this essentially is saying the first disk is a floppy. In previous posts, you learned that I tried hardcoding the value to 0x80, 0x81, and 0x82 all to no avail.

But, not all is lost. I installed the nasm package which comes with ndisasm. That means I can disassemble the MBR of my USB boot from the USB Startup Disk Creator and reverse engineer that value--or enough information to get me further along.

OK, I spent some time reverse engineering the thing. Here are a couple of useful links:
http://en.wikipedia.org/wiki/INT_13
http://en.wikipedia.org/wiki/INT_10
http://en.wikipedia.org/wiki/X86
http://wiki.osdev.org/ATA_in_x86_RealMode_(BIOS)

OK, I spent some time disassembling and getting more and more acquainted with the int 0x13 and int 0x10 BIOS calls that allow low level access to the hardware. Very interesting.

But let me make an observation that I just noticed--the Grub bootloader forces CHS mode as opposed to LBA mode when bit 7 isn't set on the drive. So my USB stick will be addressed in CHS mode and this will require the geometry (heads=255, cylinders=31, sectors/track=63). I think I'd prefer LBA mode and I'm pretty sure a USB key would support LBA mode. So I'll try to comment that out and see if it boots in LBA mode... (LBA mode lets you access the blocks on the disk as sequential numbers from 0 to the number of LBAs minus 1. CHS mode forces you to address a block with a word whose bits are divided into three fields that don't necessarily go to the top of the range and, for sector, isn't even zero-based.)
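
The relationship between the two addressing schemes is the standard formula LBA = (C * heads + H) * sectors_per_track + (S - 1). A sketch in C, with hypothetical struct and function names:

```c
#include <stdint.h>

struct geometry { uint32_t heads; uint32_t sectors_per_track; };
struct chs { uint32_t cylinder; uint32_t head; uint32_t sector; };

/* Convert a CHS address to a zero-based LBA. Sectors are 1-based. */
static uint32_t chs_to_lba(struct geometry g, struct chs a)
{
    return (a.cylinder * g.heads + a.head) * g.sectors_per_track
           + (a.sector - 1);
}

/* Convert a zero-based LBA back to CHS for the given geometry. */
static struct chs lba_to_chs(struct geometry g, uint32_t lba)
{
    struct chs a;
    a.cylinder = lba / (g.heads * g.sectors_per_track);
    a.head = (lba / g.sectors_per_track) % g.heads;
    a.sector = (lba % g.sectors_per_track) + 1;
    return a;
}
```

With 255 heads and 63 sectors per track, the first sector of cylinder 1 is LBA 16065, since each cylinder holds 255 * 63 sectors.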

So, I removed the part that forces my USB drive to go into CHS mode and it still didn't work. (btw, it is now Saturday afternoon and no longer Friday night). That tack having failed, I am off to write up some stage 1's that do nothing but tell me about my BIOS and my USB. That will be in a future posting.

Live CD on USB Stick

I dd'ed my 1GB USB to back it up and used the USB Startup Disk Creator to make my USB stick bootable. It worked. Pretty nice. It even uses the rest of the USB as storage. Excellent option for moving around to different PCs. Maybe even for a netbook. The best thing is that you can use it to boot a PC with a trashed boot sector or another major problem.

Thumbs up on the USB Startup Disk Creator.

But, it doesn't use Grub, so no insight on my Grub problem.

Wednesday, June 10, 2009

Japanese (日本語)

I found that installing Japanese language support wasn't as hard as I thought it would be. It's only hard if you do it manually. But if you go to System>Administration>Language Support, you can install it through the GUI. So now I have it. Yay!!!

Thinking About Network Driver for Grub, Part III

Recall from Part II that my USB stick seems to be booting, but control always goes to the menu.lst on my harddrive and not on the USB stick.

I decided that to truly understand GRUB, I have to study the source code. The documentation was too imprecise. So I started with stage1.S which is the only file that generates the small stage 1 and asm.S and stage1_5.c which are part of stage 1.5. It seems that the booted drive comes into the code in stage1.S with %dl set to the drive. If in the grub install command, you use the 'd' option, it will put a drive number into 'bootdrive:' which happens to be at address 0x40 of the first sector. Ah ha! But alas, it is 0xff on my dd'ed image of my USB stick, so there is no forcing of a load of the wrong stage 1.5.
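
My reading of that logic, as a sketch in C: if the byte at 0x40 is 0xff, stage1 uses whatever the BIOS handed over in %dl, otherwise it uses the embedded value. The names here are hypothetical and the real code is assembly in stage1.S.

```c
#include <stdint.h>

#define STAGE1_BOOT_DRIVE_OFFSET 0x40  /* offset of 'bootdrive:' per this post */
#define NO_FORCED_DRIVE 0xFF           /* value seen when nothing is forced */

/* Pick the drive stage1 would load from: the forced value embedded in
 * the sector if one was set, otherwise whatever the BIOS passed in %dl. */
static uint8_t effective_boot_drive(const uint8_t *sector, uint8_t bios_dl)
{
    uint8_t forced = sector[STAGE1_BOOT_DRIVE_OFFSET];
    return (forced == NO_FORCED_DRIVE) ? bios_dl : forced;
}
```

So an image with 0xff at that offset just follows the BIOS, and one with 0x82 there always goes to the third hard disk.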

Also, using GRUB installer, dd, and ghex2, I'm getting a better handle on the whole process. If I set:
root (hd2)
grub will use my USB key, and then if I use the --prefix=/boot/mark in the setup command, grub has everything look in (hd2,0)/boot/mark during the install (i.e. when grub copies and modifies the stage1 and stage1.5 files to the USB key). Note that it only embeds the stage1.5 file when the device is the whole drive and not a partition. That is:
setup --prefix=/boot/mark (hd2)
will cause a stage1.5 to get installed, but
setup --prefix=/boot/mark (hd2,0)
will not because stage 1.5 only goes with the master boot record and not a volume boot record.

Anyway changing the GRUB to MARK in stage1.S and stage1_5.c, I can see whose stage1 is getting used and whose stage 1.5 is used. So far, it's looking like my stage 1 is pulling in the HDD stage 1.5.

But if it isn't the MBR that's causing it, but the first partition on the USB where there is no stage 1.5, then I'm not sure what's going on.

Another confusing thing is that I saw one website that said that the booting drive is always (hd0). Is %dl 0 instead of 2 when it is passed to stage1? And if so, whose stage 1.5 should it be getting: (hd0)'s or (hd2)'s?

Maybe I should make a test stage1.S that writes to screen which drive number it thinks it is. That means Intel assembly language programming. Yuck! I think I'll also make sure that my MBR stage 1 and partition 0 stage 1 show different notifications.

OK, so I got an MBR stage 1 with a notification of "MBRu" and the first partition volume boot record with a stage 1 notification of "MARK". Time to reboot and see what's up... See you on the other side.

Back. OK, I saw "MBRu", so I have confirmed that it is using the MBR stage1. But it is still loading stage 1.5 from the HDD. Time to check into the 'd' option on grub's 'install' command. The d option lets you force it to load stuff from a specific disk. OK, I used it by doing a 'setup' to get the string for 'install' (it does the install for you and shows it to you), and then added a d. Then I dd'ed and ghex2'ed the USB drive to confirm that the bootdrive is 0x82 instead of 0xff. And it was. Yay! Time to reboot and see what happens....

Two steps forward, one step back. It went to MBRu and froze. Now I'll try it by pulling my 1TB USB drive and see how it likes a bootdrive of 0x81. Confirmed with dd and ghex2. Ready to try again....

...and no-go. Doesn't work. Still sits there at MBRu blinking a cursor at me. /sigh It's not loading the wrong stage1.5 anymore, but now I have to figure out what's happening when all I have is a blasted cursor.

I'm hungry, so that'll be for part IV, (hopefully) A New Hope.

Monday, June 8, 2009

Thinking About Network Driver for Grub, Part II

In Part I, I had screwed up my bootloader and could no longer boot into Ubuntu. Fortunately, that problem was easy to fix with a bit of a side-effect.

I booted up the Ubuntu Live CD. From there:
sudo grub
which put me into the Grub thingy:
root (hd0,2)
setup (hd0)
did the trick. The side effect--instead of starting to boot into the Windows bootloader, it starts to boot into the Ubuntu bootloader. So, apparently it touched the first partition. I was hoping the root (hd0,2) would make it not touch anything but the sda3 partition. Maybe it just changed the active partition? I should look into that.

Anyway, I can get back into Ubuntu and Windows. Whew!

Time to put on my thinking cap. I found a couple of sites that said that when you do the netboot, stage1 loads stage2 from tftpserver. Huh?!?! Stage1 is 512 bytes and that includes the partition table. I don't believe it. I looked at the code. Stage 1 is all in assembly and makes a few int $0x13 calls, but there is nothing that indicates it can set up a tftp session. And the makefile defined NETBOOT macros but doesn't use them. They must mean that nbgrub is loaded during stage2.

Anyway, I modified my /boot/grub/menu.lst on my HDD Ubuntu drive and explicitly rebooted from the USB stick. It used the modified menu.lst, so I know that wherever stage1 is coming from, stage2 is definitely coming from the HDD. There are a few possibilities:
  1. The USB key stage1 is loading the HDD stage2
  2. There is an error and the fallback is the HDD
  3. The BIOS doesn't actually boot from the USB key even though it's in the list
I decided to test hypothesis 1 by changing my BIOS boot order to USB Stick, CD/DVD, Harddisk. When I boot with the stick in, it leapfrogs the CD/DVD. So it looks like my stick's stage1 is loading stage2 from the HDD.

What about grub itself? I learned a new command: which. I can type:
which grub
and it will tell me where it finds the grub it's running. It's the system's grub. When I try to explicitly run my compiled grub, I get a segmentation fault. Running it under gdb, I find it's crashing because of a corrupted pointer in grub_printf when it tries to handle the %s by passing the incoming string to grub_putstr.

It seems gcc once compiled:
void my_printf(const char* format, ...)
{
int* dataptr = (int*)&format;
dataptr++; // now it points to the args of the
// args to be used for the format
// string's %s, %d, etc
}
but from gcc 3.4, according to the google-verse, this "cowboy" method no longer works--hence there are patches.
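
For comparison, this is roughly what the portable approach looks like with the <stdarg.h> macros. It is a toy formatter of my own that handles only %s and writes into a buffer; it is not the pardus patch and not grub_printf.

```c
#include <stdarg.h>
#include <stddef.h>

/* Tiny formatter handling only %s; writes into buf and always
 * NUL-terminates. Returns the number of characters written. */
static size_t tiny_format(char *buf, size_t len, const char *format, ...)
{
    va_list ap;
    size_t n = 0;

    if (len == 0)
        return 0;
    va_start(ap, format);            /* portable: no pointer arithmetic */
    for (; *format != '\0' && n + 1 < len; format++) {
        if (format[0] == '%' && format[1] == 's') {
            const char *s = va_arg(ap, const char *);
            while (*s != '\0' && n + 1 < len)
                buf[n++] = *s++;
            format++;                /* skip the 's' */
        } else {
            buf[n++] = *format;
        }
    }
    va_end(ap);
    buf[n] = '\0';
    return n;
}
```

The compiler decides where the arguments live, and va_arg fetches them, so nothing depends on the arguments sitting right after format on the stack.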

It's progress. And I'm reacquainting myself with gdb which I haven't touched in ages. Some people in Turkey working on pardus have a patch which forces GRUB to use varargs. I downloaded a patch file from their svn (after allowing Firefox to let them use a self-signed certificate). Actually a google search on "grub varargs" turns up patches in places without Firefox getting pissed off about self-signed certificates.

And I get to learn a new Linux command: patch. OK, I have used it before since Subversion is based on it and I've used it there, albeit in Windows.
patch --verbose -p 1 -d ./grub-0.97 -i patches/varargs.diff
Did the trick except for one file which I had to do manually, though I'm not sure what the problem was--perhaps whitespace since it looked like the unified diff should have matched. I had to guess on the -p 1 and of course I practiced with the --dry-run option.

So, off to make, but, naturally, it fails. First, misc.c doesn't have the #include , so I added it, but the worse problem seems to be the -nostdinc compiler option that's set all over, preventing it from being found. I guess the patch isn't exactly complete or doesn't cover my set of code. Then I had to remove the -nostdinc from stage2. Once that was done--it compiled to the end!

But will it run? Yes!!! It even seemed to install stuff on my USB key. Now to shutdown and test.

And this seems like a good place to end Part II.

Useful links:

http://www.linuxselfhelp.com/gnu/grub/html_chapter/grub_3.html

http://www.freesoftwaremagazine.com/articles/grub_intro

Using testdisk and gparted to Fix My USB Stick

OK, I messed something up. I used 'dd' to write to /dev/sdc aka my USB key. Something like this:
dd if=./stage1/stage1 of=/dev/sdc
dd if=./stage2/stage2 of=/dev/sdc seek=1
Well, it seems I wiped out the partition table or something. When I came back to Ubuntu, there was no icon for the USB key. When I went to gparted, it was there, but instead of a partition, there was only an 'unallocated'. I tried to use gparted's "Create Partition Table..." but I just got an error message about being unable to access /dev/sdc.

To google!

Naturally, I'm not the first moron to screw up a USB key. I get to stand on the shoulders of morons before me. They asked, and the google-verse responded: testdisk.
sudo apt-get install testdisk
testdisk /dev/sdc
It took a little while to get my testdisk legs. But soon I found that you can wipe out the MBR and then select 'none' as a file system, and then recreate a boot sector as FAT16 or something like that. I still couldn't add a partition. But at least I repaired it enough that Linux regards it as FAT16.

I logged out and this time gparted allowed me to make a partition. Whew! Now that I have it working again, maybe I'll do:
dd if=/dev/sdc of=/home/mark/backups/Cruzer256.bak
so if I wipe it out again, I can do:
dd if=/home/mark/backups/Cruzer256.bak of=/dev/sdc
Now back to figure out where I went wrong in causing this problem in the first place... Maybe I should have done:
dd if=./stage1/stage1 of=/dev/sdc1
dd if=./stage2/stage2 of=/dev/sdc1 seek=1
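
A likely explanation, for what it's worth: on a disk with a classic MBR, the first 512-byte sector holds boot code in bytes 0-445, the partition table in bytes 446-509, and a two-byte signature at the end. Writing a full 512-byte stage1 to sector 0 of the whole device therefore replaces the partition table too. The sketch below reproduces that on an ordinary file instead of a real device. The names usb.img and stage1.bin are made-up stand-ins, and copying only the first 446 bytes is a common workaround that this post did not verify for GRUB.

```sh
#!/bin/sh
# Sketch: why dd'ing 512 bytes onto sector 0 wipes the partition table.
# Works on plain files only; usb.img and stage1.bin are hypothetical stand-ins.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
img="$work/usb.img"       # pretend this is /dev/sdc
boot="$work/stage1.bin"   # pretend this is a 512-byte stage1

# A 1 MiB "disk" of zeros with a marker where the partition table lives.
dd if=/dev/zero of="$img" bs=512 count=2048 2>/dev/null
printf 'PARTITION-TABLE!' | dd of="$img" bs=1 seek=446 conv=notrunc 2>/dev/null

# A fake 512-byte boot sector filled with the letter X.
dd if=/dev/zero bs=512 count=1 2>/dev/null | tr '\000' 'X' > "$boot"

# Careful version: copy only the boot-code area (bytes 0-445).
# conv=notrunc keeps dd from truncating the image file; on a real
# block device there is nothing to truncate.
dd if="$boot" of="$img" bs=446 count=1 conv=notrunc 2>/dev/null
safe=$(dd if="$img" bs=1 skip=446 count=16 2>/dev/null)

# What the commands in this post did: copy the whole 512-byte sector.
dd if="$boot" of="$img" conv=notrunc 2>/dev/null
clobbered=$(dd if="$img" bs=1 skip=446 count=16 2>/dev/null)

echo "after 446-byte copy: $safe"
echo "after 512-byte copy: $clobbered"
```

The second dd corresponds to the first command above with of=/dev/sdc; the marker standing in for the partition table does not survive it.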

Thinking About Network Driver for Grub, Part I

GRUB, a bootloader for Linux, doesn't come with network configuration installed. If you want it, you have to compile your own. And when you compile, you have to choose a driver for your hardware. Sounds easy enough, right? Wrong. You have to figure out which hardware you have and find the driver which may not even exist.

How do I know which hardware I have? Well, there's probably some easy Linux way to figure it out, but I decided to boot into Windows and use the Control Panel to bring up the Device Manager.

I have 3 Network Adapters. One is wireless and the other two are a "1394 Net Adapter" and "Intel (R) PRO/1000 PL Network Connection". A little googling and I found that 1394 is for Firewire. The Intel connection has a PCI location, so it seems like a physical piece of hardware on my motherboard. So I'm guessing that's it. The computer I want to boot has an "Intel (R) PRO/100 VE Network Connection".

Well, there is one driver, Intel Etherexpress Pro/100, which requires sending "--enable-eepro100" to the makefiles. I'm going to guess that's the one I need. Even if it's old, I'm just doing simple stuff, so it should work.

So back to Ubuntu...

I figured out that I can use ./configure with my necessary --enable-xxx options and it will set up my makefiles for me so I can just type 'make'. Or at least that seems to be what it does. I ran make and, of course, it failed. There are 2 variables that are declared as extern in etherboot.h but defined as static (file scope) in main.c. Hmmm.....

I suppose I can just make the static variables unstatic...? Well, it seems I'm not the first with this problem. These guys had the same problem and that's their solution: http://www.dietpc.org/build.htm

For a little context, they claimed it was because of gcc 4.x. I guess the compiler considers that an error from 4.0 on. I'm not sure I'd really consider it an error, but I can see the rationale.

Anyway, once that's done, it compiles. It creates a whole pile of images. I'm hoping that the stage2 is the one I want. So, I took an old USB key I have lying around, used gparted to format it as FAT16 and set the lba flag. Then I dd'ed stage1 to lba 0, and stage2 to lba 1+.

Now to shut down my system and see if I can boot from it....

Back. GRUB loaded as evidenced by the fact that the word GRUB appeared in a solid wall of the word GRUB. About 16 GRUBs per line and scrolling fast. So, I'd say, I had moderate success. The BIOS obviously found GRUB and GRUB is obviously doing something--just not what I want it to do. But I'm on a journey of discovery, so this is progress.

A little stumbling block here. I screwed up my USB key. http://marklearnslinux.blogspot.com/2009/06/using-testdisk-and-gparted-to-fix-my.html

I think I got it working now. I used grub-install /dev/sdc instead of dd'ing the stage1 and stage2. I'll reboot and see what happens...

OK, it booted up and said GRUB, but then didn't do anything. I was hoping for the menu or the prompt or something asking for more configuration information. But no.

Then another disaster--it seems I installed that GRUB to my sda boot sector as well, so my Ubuntu boot goes to "GRUB _". Argh! Now I have to boot into the Live CD and try to restore the normal bootloader. Sigh....

Well, I guess I'll post this and continue in a future post.

Nice links:

http://www2.informatik.hu-berlin.de/~draheim/boot/grub-netboot.print.html

http://www.linuxhq.com/ldp/howto/Network-boot-HOWTO/index.html

http://www.sfr-fresh.com/linux/misc/grub-0.97.tar.gz/

http://linuxgazette.net/issue64/kohli.html

dd: Like Fire for a Caveman

Before I could install Ubuntu on my harddisk, I needed to make a backup in case stuff went completely wrong. It's not easy finding a good way to do that. Most googling turns up recommendations for expensive packages that I don't want to pay for. I eventually went for the free Macrium Reflect.

Part of the installation process was using the 'dd' command to get an image of the boot sector. It was something like this:
dd if=/dev/hda3 of=/mnt/backup/bootimage.bin bs=512 count=1
That means that the input file is simply the partition /dev/hda3, the output file is a file on the Windows backup partition that I mounted under the /mnt directory, the block size is 512 bytes (one sector) and the number of blocks to copy is 1.

Why not use 'dd' to do a backup? With the Ubuntu live CD, all you would need to do is:
  1. create a directory where you want to mount the drive to store the backup
  2. mount that drive
  3. dd the whole partition to save into the backup drive
To restore:
  1. create a directory where you want to mount the drive where you stored the backup
  2. mount that drive
  3. dd the image back from the backup drive onto the partition to restore
Also, you can apparently treat the entire disk as a file and back the whole thing up--not just the partitions. This is good if you want to make an exact copy of the same disk. But it won't necessarily work if the disk is replaced by some other disk off the shelf at the local computer store and has different geometry.
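
The backup and restore steps above can be sketched with dd on ordinary files, which is a safe way to practice before pointing it at a real partition. In this sketch partition.img stands in for something like /dev/hda3 and backup.img for a file on the mounted backup drive; both names are made up, and the mount steps are left out.

```sh
#!/bin/sh
# Sketch: back up, wreck, and restore a "partition" using dd on plain files.
# partition.img and backup.img are hypothetical stand-ins for a real
# partition device and a file on a mounted backup drive.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
part="$work/partition.img"
backup="$work/backup.img"

# A fake 64 KiB partition full of random data.
dd if=/dev/urandom of="$part" bs=512 count=128 2>/dev/null

# Backup: copy the whole partition, and separately just its first sector.
dd if="$part" of="$backup" bs=512 2>/dev/null
dd if="$part" of="$work/bootimage.bin" bs=512 count=1 2>/dev/null

# Simulate a disaster by zeroing the partition.
dd if=/dev/zero of="$part" bs=512 count=128 conv=notrunc 2>/dev/null
if cmp -s "$part" "$backup"; then wiped=no; else wiped=yes; fi

# Restore: same command as the backup with if= and of= swapped.
dd if="$backup" of="$part" bs=512 conv=notrunc 2>/dev/null
if cmp -s "$part" "$backup"; then restored=yes; else restored=no; fi

boot_size=$(wc -c < "$work/bootimage.bin" | tr -d ' ')
echo "wiped=$wiped restored=$restored boot sector bytes=$boot_size"
```

Checking the result with cmp afterwards costs little and confirms the image went back byte for byte.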

The dd command is very useful and very powerful. It is also very dangerous. One had best type very carefully and double-check everything before pressing 'enter'.

Sunday, June 7, 2009

Eclipse

Eclipse is an IDE that is highly configurable through plug-ins. I have 2 reasons for learning how to use it:
  1. I would like to use it to point to my cross-compiled uClibc toolchain so I can build projects to learn embedded Linux
  2. I am working on a little VM of my own design and I'd like to make a syntax highlighter for my assembly and use it to assemble them for running in the VM
So I went through the 2 hour download in Synaptic. Now it's installed and I'm going through the extensive documentation. I must say that even though the docs are extensive, I don't have enough context to really have a grip on them. So, I think I'll start with a nice little project of attempting to compile my Hello World project.

It was a little disorienting at the beginning. The CDT package has a lot of stuff. I used the managed make system where Eclipse makes the makefiles for me. But eventually I got it figured out.

I had to do a few things to get it working. First, I needed to get my toolchain binaries into a directory. The binaries seem to all be under a directory called 'staging_dir' so they are all together. The other thing I had to do was set up macros and environment variables in Eclipse as well as point to the binaries.

Now that I've played with it for a little while, I'm learning a few things. First, it seems to matter where you build the toolchain. I built it and then moved stuff. But probably the thing to do is build it somewhere and create directory links to the important directories. The compiler seems to know where stuff was originally and it isn't happy when I move it. The libraries, I can deal with by setting up search directories in the linker settings, but the crt1.o seems to need to be in the same place.

So now to figure out how to make a link. I think it's 'ln' something....
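
For the record, the command is ln -s TARGET LINK_NAME, which creates a symbolic link. Here is a minimal sketch using a throwaway directory; the layout and the file name i486-gcc are made up for illustration and are not the real toolchain paths.

```sh
#!/bin/sh
# Sketch: leave a directory where it was built and reach it through a symlink.
# The directory layout and the file name i486-gcc are hypothetical.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Pretend this is where the toolchain was originally built.
mkdir -p "$work/build/staging_dir/bin"
echo 'pretend compiler' > "$work/build/staging_dir/bin/i486-gcc"

# ln -s TARGET LINK_NAME: the link is a new name pointing at the original.
ln -s "$work/build/staging_dir" "$work/toolchain"

# readlink shows where the link points; files are reachable through it.
link_target=$(readlink "$work/toolchain")
through_link=$(cat "$work/toolchain/bin/i486-gcc")

echo "toolchain -> $link_target"
echo "read through link: $through_link"
```

Nothing is copied or moved, so anything that remembers the original build path keeps working.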

Road Map

Since I want to learn how to use Linux as an embedded system tool, I will attempt to create a small Hello World project. Here is my roadmap. The order may vary.

  1. Prepare my cross compiler toolchain
  2. Compile a kernel
  3. Compile GRUB with network support
  4. Install GRUB onto a USB stick
  5. Set up my laptop to serve the kernel over tftp and serve a net filesystem to be the root filesystem
  6. Set up CVS
  7. Prepare Eclipse to compile with the toolchain for my Hello World project
  8. Write a program that writes "Hello World" to the screen
  9. Plug the USB stick into Elena's computer, restart it, and see "Hello World" on the screen
Looks simple enough, eh? Especially that last step.

Tinkering During the Install

So while I was waiting for my toolchain to compile, I decided to play a bit. First, I confirmed that I can see East Asian fonts by googling Yomiuri Shimbun. And voilà! There it was: 読売新聞. It looks right on my computer--if not necessarily on the reader's. And I learned about keyboards--specifically, the compose key. I assigned the compose key to my right 'ctrl' key which I never use anyway.

The compose key lets you use two keys to create a character otherwise not on the keyboard. This includes all the accented characters in French which I like to type correctly. It also works on less common languages like Romanian--I can actually type Vlad Ţepeş. There are cool sequences for all sorts of characters--the following list of triplets is the two characters to type after hitting 'compose' and the resulting character: Or®, Oc©, tm™, !!¡, ??¿, <<«, >>», c=€, oo°, y=¥, -l£, xx×, +-±, -:÷, muµ, 12½, 13⅓, 14¼, 34¾.

I also got Thunderbird installed so I can read my email.

I added a calculator, terminal launcher, and Thunderbird to the panel.

I'd like to install a Japanese input method, but setup seems a tad complex so I'll do that later. I have enough to do right now...

Compiling the Toolchain

Since my goal is development of embedded systems, I decided to use the uClibc library and compile a cross compiling toolchain to target an i486 with uClibc. I downloaded a compressed version of the library and had to decide between a .tar.gz and a .tar.bz2. Like I know the difference! I went with the bz2. It took a while to download. I unzipped and ran:
make menuconfig
I selected some options I'd like to have access to and then ran:
make
It didn't take long to fail. There was no 'bison'. No 'flex' (didn't I try that before?). A few other missing items. I googled and found out how you install a needed binary:
apt-get install name-of-the-missing-program
There is a slightly overwhelming GUI for it as well where you can look at the CD (I think) or the universe or the multiverse. I'm still fuzzy on the difference between the universe and the multiverse... Anyway, after a few make/apt-get cycles, it was off. But it was late--about 2am. So I went to sleep.

When I woke up, it was stopped--on an error. I knew it wouldn't be smooth. Apparently, I need to install g++. Really? I installed it. And restarted. Now it would compile for an hour or so between failures. It did a lot of downloading. Some of the failures were tricky to figure out. For example, when autoreconf was missing, I typed:
apt-get install autoreconf
but it didn't work. I googled and learned what I had to enter:
apt-get install autoconf
so I was off again.

I had some problems with the Ruby install--I think it was trying to use Ruby scripts with the newly compiled Ruby, which would sort of work since it was an i486 build, but I'm not sure the libraries would link correctly on the host. I removed it from the .config with make menuconfig to get it past the error.

By the time the afternoon rolled around, it finished. So, now I have a toolchain that may or may not work. Time to start other stuff.

Saturday, June 6, 2009

Repartitioning

If you're going to install Linux on a laptop, you'll need to repartition your hard drive. And if you're going to repartition your hard drive, you gotta back up in case stuff goes really wrong. OK, I should be backing up anyway, but all my real computer stuff is in storage in Seattle and I'm just in Vancouver with my laptop.

So, off to Future Shop where I found a 1TB USB drive for CDN$130. Then I went to Best Buy where I found nothing priced as well. Then to London Drugs where nothing even came close. So I got 1TB of storage.

So, time to back up my C drive. Hmmm.... How to do it.... I guess I know how to back up. But what do I need to do to restore? Windows XP provides a backup utility in Start>Programs>Accessories>System Tools, but the option in which you make a full backup requires a floppy disk. I googled all over and everything seemed to indicate a floppy was needed. I started the full backup and went back to google--there has to be a way to trick it. Ugh!!! Nothing. In fact, one forum even said that some USB floppy drives won't be recognized. I can do a CD or a USB stick. But I have an equal number of 5¼", 3.5", and 8-Track tape players--all zero. This is insanity!

Fortunately, my friend Cathy saw my problem in my Facebook status and suggested Macrium Reflect which is free. I downloaded it and did a backup. It didn't work perfectly the first time, but I was using the computer. Once I stopped and let it do what it had to do, it was happier. Then I made a Recovery CD. The recovery CD booted and recognized the USB drive and found the backup. So I figured I was in good shape.

I defragged my C: drive a few times until it was pretty packed to the lower part of the drive. Then I booted into Ubuntu from the CD and ran gparted. I had read a page about how to do this. Actually, I think Ubuntu makes it even easier than the way I did it: you tell the installer how big to make the partition and it does the work. But I used gparted to make an ext3 partition of 20GB and a linux-swap partition of 2GB.

The repartition was over--time to see if Windows was OK. Upon booting, XP noticed its new smaller digs. It did a disk check and accepted its new situation gracefully.

Back to the Ubuntu Live CD. Now, time to install. The install was pretty straightforward. I assigned it to the ext3 partition. It complained that I hadn't marked it for formatting which I figured out.

The boot issue then came up. I had read earlier that you don't want to wipe out the MBR of the C drive. The website I read suggested you put the boot loader in another partition and then use the Linux 'dd' command to make an image of the first sector and turn that into a file which can be given to Windows and referenced in the boot.ini file. So I installed the bootloader on the ext3 partition. Then I mounted a fat32 partition for the Windows recovery and then ran the dd command.

Back in Windows, I found the file I had created on the D: drive and copied it to C:. Then added a line to the boot.ini.

Upon booting, I could boot either Windows, or kick from Windows to the Linux bootloader where I could boot Linux. I haven't yet, but I think I can make Linux the default and shorten the timeout so that I can skip straight to the Linux bootloader.

Ubuntu on Live CD

There are a lot of books on Ubuntu at Chapters. I went to the section, cracked open a book and started to read it. I learned something promising--you can boot Ubuntu on a CD and it won't touch your harddisk. This lets you try it out and kick the tires a bit. Nice bit of information. It said that you can use the CD in the back of the CDN$35 book or download it off the internet. So I shelved the book and headed home.

Soon I was downloading an Ubuntu .iso file. Sadly, we only have Shaw LiteSpeed so it was a 6 hour download. But soon there was an iso sitting there in my download directory. Now to burn it to a CD-R. I opened the directory of the CD drive and dragged it over. It said something about losing information. Hmmm.... I better not burn anything yet.

Nero? Can't find it.... Nothing on my Windows XP system looks like it will burn a .iso CD image file to a CD even though Windows XP lets you burn files to a CD. OK... To google!

But there is hope! http://isorecorder.alexfeinman.com/isorecorder.htm

I downloaded Alex Feinman's program and it adds a menu item when you right click a .iso file that lets you burn it to the CD-R. Why is this not included with Windows XP?!?! Anyway, it solved my problem. In a minute or so, I had an Ubuntu CD.

I went to restart and waited to see what happens. Windows shut down and in seconds I saw the word Ubuntu in its Orange/Yellow/Red color scheme across my screen. A little musical thing played, and then the screen went funny. It was funny for about 5 seconds, but then a desktop appeared. Cool. I found the terminal program and typed 'gcc'. It's there. Nice. I typed 'flex'. It's there too. Very nice. So video works, audio works. But I went to the network stuff in the Admin tools and couldn't get my WiFi connection. There was no list of WiFi points. Oh well. I shut down and went back to Windows.

I went to see how to configure WiFi and didn't quite have enough context to fully understand what I was seeing, but going back to Ubuntu, it became clear. You have to click the little strength bar icon. Then all the available WiFi points came up. I found mine and connected without a hitch. Then I logged into Facebook and it worked perfectly.

Cool. I would like to install this.

Buying and Reading "Building Embedded Linux Systems"

Last week I picked up Building Embedded Linux Systems from Chapters at Robson & Granville and headed to the Yaletown Starbucks to read it. It was a sunny day--a wonderful day to learn something new.

The concepts were familiar. I worked on a project at my last job that used rather similar steps to bring up and develop a system, except with a different RTOS. One of the bigger differences seemed to be that Linux is more dependent on a filesystem than the RTOS's I have used thus far. And the idea of using a network filesystem is interesting.

I learned a bit about boot-loaders. I had used Lilo to dual boot my Windows 98 / Redhat box. But I wasn't aware of the features available in a full featured boot loader.

The cross compiling tool chain construction with smaller libraries was interesting. I am used to getting a tool chain, so it is interesting to think about the issues of cross compilation and generating a tool chain.

But soon my coffee was empty and I had to go home. I was engrossed in the book for the next couple of days. The concepts were gelling quickly. I was getting excited about trying out some of the ideas. But all I have is a laptop...

Welcome

I used to use Unix back in my university days. I was never a power-user, but I knew the basics. A few years later, I did a dual-boot Windows 98/Redhat Linux on my PC. The thing I remember most was spending hours just trying to get my %&($^# mouse to work. I eventually did, and I toyed with it a little, but I was in Japan and very busy with work (yes, I did work those long hours). Also, my internet connection was Niftyserve--a sort of Japanese Compuserve on dialup and you paid by the minute there. Some of my friends were just starting to get excited by the growth of broadband into their neighborhoods. And google either didn't exist or was not yet on my radar. Finding information was difficult. So playing with Linux wasn't something I pursued seriously. And since then, I never did much with Linux.

I progressed as an embedded systems developer where resources were limited (we wrote a complex multithreaded program in 128KB of ROM and 2KB of RAM). But, in many cases, the hardware has caught up to PCs and Linux has become a viable option for even relatively small devices. So, in an effort to catch up my own skill set, I will be trying to figure out Linux in the context of embedded systems.

This blog will track my progress.