About the Author

Khairil believes that one can earn a decent living and enjoy life while enriching the wider community.

 

He currently works as an IT consultant for Inigo and advocates FOSS and free knowledge and culture.

 


All content on this site is shared under the Creative Commons Attribution License 3.0.

 

This is a personal site; comments and content here do not reflect the official viewpoint of Inigo Consulting.


Kaeru's Online Journal

Online journal entries sorted by date

Showing blog entries tagged as: FreeBSD

Network media status and settings

Posted by Khairil Yusof at Mar 16, 2010 11:05 AM
Network Switch Lights

Most of us now work with relatively large amounts of data, whether it be media or other files. I've been on a Gigabit Ethernet switch for a few years now, because transferring data or virtual machine images of several gigabytes over the network is painfully slow at 100Mb/s (12.5MB/s max). If you hit this limit when transferring files with GigE equipment and Cat5e/6 cables, chances are auto-negotiation has settled on a conservative speed.
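The 12.5MB/s figure is simply the line rate divided by eight bits per byte, ignoring protocol overhead. A minimal sketch of that arithmetic, assuming awk is available (the function name is made up for illustration):

```sh
# Convert a link speed in megabits per second to the theoretical
# maximum in megabytes per second (8 bits per byte, no overhead).
mbit_to_mbyte() {
    awk -v mbit="$1" 'BEGIN { printf "%g\n", mbit / 8 }'
}

mbit_to_mbyte 100    # Fast Ethernet    -> 12.5
mbit_to_mbyte 1000   # Gigabit Ethernet -> 125
```

Real transfers will come in below these ceilings.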

One usually thinks of wired connections as relatively plug and play, and that's true for the most part. Unfortunately, I found out recently that, at least on my Ubuntu Linux workstation with cheap networking equipment such as the RealTeks, the Lantecs and whatnot that you have at home, the defaults may set your media speed to 100Mb/s (Fast Ethernet) and not 1000Mb/s (Gigabit Ethernet).

These days you do not need to look at blinking lights to see if stuff is connected (usually).

Checking and setting Ethernet media status on Linux

sudo ethtool eth0 (or your ethernet device):

Settings for eth0:
    Supported ports: [ TP MII ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
    Supports auto-negotiation: Yes
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
                            1000baseT/Half 1000baseT/Full
    Advertised auto-negotiation: Yes
    Speed: 1000Mb/s
    Duplex: Full
    Port: MII
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
    Supports Wake-on: pumbg
    Wake-on: g
    Current message level: 0x00000033 (51)
    Link detected: yes

You'll notice that Speed here is at 1000Mb/s. Initially it was at 100Mb/s by default on mine.

Setting it is rather straightforward, with speed defined in Mb/s:

sudo ethtool -s eth0 speed 1000

The man page for ethtool is actually friendly with examples, something that often isn't the case in Linux.

You probably want to set this as default on startup, in something like rc.local.
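As a rough sketch of what such a startup snippet could look like: the script below only forces the speed if the link came up slower than wanted. The interface name, the target speed and the function names are assumptions for illustration, and it uses the same ethtool command as above.

```sh
#!/bin/sh
# Sketch for rc.local: force gigabit only if the link came up slower.
# IFACE and WANT are assumptions; adjust them for your own hardware.
IFACE="eth0"
WANT=1000

# Extract the numeric speed from `ethtool <iface>` output read on stdin,
# e.g. "    Speed: 1000Mb/s" -> 1000
parse_speed() {
    awk '/Speed:/ { sub(/Mb\/s/, "", $2); print $2 }'
}

# Compare the current speed with the wanted one and set it if they differ.
fix_speed() {
    current=$(ethtool "$IFACE" | parse_speed)
    if [ "$current" != "$WANT" ]; then
        ethtool -s "$IFACE" speed "$WANT"
    fi
}

# In rc.local you would then call: fix_speed
```

Keeping the parsing in its own function makes it easy to try out against saved ethtool output before running it as root.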

Checking and setting Ethernet media status on FreeBSD

The ifconfig command on FreeBSD generally provides all this info for you:

re1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 7200
    options=389b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
    ether 00:13:f7:3a:80:f3
    inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
    inet 10.1.1.1 netmask 0xffffff00 broadcast 10.1.1.255
    inet 10.1.1.2 netmask 0xffffffff broadcast 10.1.1.2
    inet 10.1.1.3 netmask 0xffffffff broadcast 10.1.1.3
    inet 10.1.1.4 netmask 0xffffffff broadcast 10.1.1.4
    inet 10.1.1.5 netmask 0xffffffff broadcast 10.1.1.5
    inet 10.1.1.6 netmask 0xffffffff broadcast 10.1.1.6
    inet 10.1.1.7 netmask 0xffffffff broadcast 10.1.1.7
    media: Ethernet autoselect (1000baseT <full-duplex>)
    status: active

No problems here, 1000baseT as default.
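If you want to check this from a script rather than by eye, a small filter can pull the negotiated media out of the ifconfig output. This is only a sketch that assumes the media line has the form shown above; the function name is made up.

```sh
# Print the negotiated media from `ifconfig <iface>` output on stdin,
# e.g. "media: Ethernet autoselect (1000baseT <full-duplex>)" -> 1000baseT
parse_media() {
    sed -n 's/^[[:space:]]*media: .*(\([^ )]*\).*/\1/p'
}

# Example on the server: ifconfig re1 | parse_media
```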

A bit of tuning and jumbo frames

More on jumbo frames (Wikipedia) and their benefits.

With default settings I already get a speed increase over Fast Ethernet (to ~25MB/s), but you can tune things further. One tuning option is to enable jumbo frames. The default MTU is only 1500. Most of us at home are likely to be using some sort of RealTek card. The usual MTU for jumbo frames is 9000, but RealTek cards only support a max MTU of 7422. For RealTek the max is 7200 on Linux and 7422 on FreeBSD, so I set both to 7200.

Setting the MTU can be done graphically or via ifconfig on both operating systems.
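For the command line route, the ifconfig syntax is the same on both systems. The wrapper below is a sketch: the function name and the DRY_RUN variable are my own additions, so the command can be printed and checked before it touches a live link.

```sh
# Set the MTU on an interface. With DRY_RUN set, only print the command.
set_mtu() {
    iface="$1"
    mtu="$2"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "ifconfig $iface mtu $mtu"
    else
        ifconfig "$iface" mtu "$mtu"
    fi
}

# Examples (interface names are the ones from this entry; run as root):
#   set_mtu eth0 7200    # Linux workstation
#   set_mtu re1 7200     # FreeBSD server
```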

Now I'm getting around 40MB/s, which is about 3.2 times the 12.5MB/s ceiling of the initial default setting of 100Mb/s on Linux.
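To put the numbers side by side, here is a small sketch that computes the ratio between two throughputs. It compares against the theoretical 12.5MB/s Fast Ethernet ceiling rather than a measured figure, and the function name is made up.

```sh
# Ratio of a measured throughput to a baseline, both in MB/s.
speedup() {
    awk -v new="$1" -v old="$2" 'BEGIN { printf "%.1f\n", new / old }'
}

speedup 40 12.5   # jumbo frames vs. the Fast Ethernet ceiling -> 3.2
speedup 40 25     # jumbo frames vs. untuned gigabit           -> 1.6
```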

ZFS on FreeBSD and Benefits of Software RAID

Posted by Khairil Yusof at Feb 28, 2010 01:35 PM |
Filed under: ZFS, FOSS, FreeBSD

This was an unplanned journal entry. I wasn't planning on an upgrade and update to my home server which runs on FreeBSD. Bad things seem to happen all at once, and soon after I got a nasty throat infection, my home server motherboard died. During installation of the motherboard one of the mirrored disks of the main file storage device failed. Time to make lemonade I guess.

A few lessons here:

  • Always have RAID-1 or RAID-5/RAID-Z, even for workstations. In this case, no priceless family photos or videos were lost. For workstations, you don't lose any time from work, and can grab a replacement disk later.
  • Software RAID is flexible for commodity hardware, which often does not have 1 to 1 replacements at the shop a year or so after you bought it. You can usually just connect the old drives to a new motherboard, controller or another PC and it will just work. For desktop users, Fedora Linux lets you set it up via GUI during installation. Hopefully Ubuntu will have it too, as I think it's a good thing if it's easy for home users.
  • The RAID-1 of most motherboards works as it should, and you can disable the RAID setting and the drive(s) will still be easily accessible as a normal drive. As per the previous point, software RAID is recommended.

Time for ZFS

http://kaeru.my/journal/images/zfs-man.jpg

The two failures conspired to force this upgraded setup earlier than anticipated. FreeBSD 7.1 had problems booting on the MSI KA70VM from a PATA drive, forcing me to do a FreeBSD 8.0 binary upgrade from CD (totally trouble free, I might add). The current best bang for the buck drives are 1TB, and that size is painful with UFS2. With ZFS production ready on 8.0, it's time for a modern storage layout.

ZFS Man (YouTube) is a funny and informative introduction to ZFS on FreeBSD.

These resources will get you going:

Some more tips here:

RAID-Z or Mirror?

Constantin Gonzalez has written an informative blog on this.

Your options are more space for less money (more usable space per drive) in a less flexible setup (RAID-Z), or less space in a more flexible and faster mirror setup. With 6 SATA ports, and the Antec P182 case having a 4 + 2 drive cage layout, a mirror setup makes more sense on commodity hardware where data loss is more of a concern than space.

Here is my list on why mirror makes more sense for commodity hardware:

  • I don't need that much space. I don't have large media requirements for critical shared data. Non-critical data can also sit safely on my mirrored workstation drives.
  • You need boot disks, which should be mirrored. Currently I'm using 2 x 80GB PATA drives, but this won't be feasible in the near future. So that leaves you with 4 SATA ports.
  • Another SATA port is taken up by your DVDR drive.
  • So you're left with 3 slots. With this number, it doesn't make sense for me to run RAID-Z, even more so given the option of a 3-way mirror and of swapping in larger drives to seamlessly upgrade your mirror. That makes sense on a household budget, where it's hard to justify buying 5 disks.
  • More drives = more heat and power usage = more noise.

Since commodity drives are likely to fail anyways, I grabbed a pair of the cheapest 1TB drives available which currently are the Samsung Spinpoint F1. Performance surprisingly was not bad for these drives.

Setting it up

This part blew me away.. ZFS rocks.

Running atacontrol list shows that my two new drives are ad0 and ad1:

ATA channel 0:
    Master:  ad0 <SAMSUNG HD103UJ/1AA01118> SATA revision 2.x
    Slave:   ad1 <SAMSUNG HD103UJ/1AA01118> SATA revision 2.x
ATA channel 1:
    Master:  ad2 <ST380023A/3.33> ATA/ATAPI revision 6
    Slave:   ad3 <Maxtor 6L250R0/BAH41G10> ATA/ATAPI revision 7
ATA channel 2:
    Master:      no device present
    Slave:       no device present
ATA channel 3:
    Master:      no device present
    Slave:       no device present
ATA channel 4:
    Master:      no device present
    Slave:       no device present
ATA channel 5:
    Master: acd0 <PIONEER DVD-RW DVR-212/1.21> SATA revision 1.x
    Slave:       no device present

So let's create our mirror pool:

zpool create data mirror ad0 ad1

That's it. data is the pool name I used, and it's automatically mounted at /data (no need to mess around with fstab and such).

Let's find out our new pool status:

[kaeru@xavier ~]$ zpool status
  pool: data
 state: ONLINE
 scrub: none requested
config:

    NAME        STATE     READ WRITE CKSUM
    data        ONLINE       0     0     0
      mirror    ONLINE       0     0     0
        ad0     ONLINE       0     0     0
        ad1     ONLINE       0     0     0

errors: No known data errors

And where it's mounted and how much space is available:

[kaeru@xavier ~]$ zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
data                  105G   808G    27K  /data
...

I've snipped some data here on some other mountpoints, hence some space is used already. This is immediately usable like any other filesystem.

Here is where some clarification is needed. The pool can act both as a device and filesystem. So by default data is the name of the pool and also the filesystem.

You can already copy files and such to this /data filesystem. However, everything in it will be treated as if it's a single partition, so you can't do fancy stuff like set quotas, additional copies, compression and so on for subdirectories.

In order to do that, you need to create additional filesystems using the data pool:

zfs create data/jails
zfs set mountpoint=/jails data/jails

This is going to create a jails filesystem in the data pool, and automatically mount it as /jails. The mount command will show how this works:

mount
...

data/jails on /jails (zfs, local)
data on /data (zfs, NFS exported, local)

...

ls /data/jails is going to say no such file or directory, because there is no directory there. You could mkdir /data/jails if you wish, but that would be a plain directory, not the filesystem.

By default, without the mountpoint option, data/jails would have been automatically mounted as /data/jails. In the above example the difference between a filesystem and normal directory is clear. This difference is important when you export filesystems and wonder why /data is empty.
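One way to tell the two apart from a script is to ask df where the filesystem containing a path is mounted. The helper below is a sketch with a made-up name. It assumes the path is given exactly as it appears in the mount table (no trailing slash, no spaces).

```sh
# Succeed if the given path is itself a mount point, i.e. the filesystem
# that contains it is mounted exactly there.
is_mountpoint() {
    mounted_on=$(df -P "$1" 2>/dev/null | awk 'NR == 2 { print $6 }')
    [ "$mounted_on" = "$1" ]
}

# Examples with the paths from this entry:
#   is_mountpoint /jails      && echo "/jails is a filesystem"
#   is_mountpoint /data/jails || echo "/data/jails is a plain directory (or missing)"
```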

Automatic exporting of NFS/SMB shares

Exporting filesystems can now be done automatically using zfs commands:

zfs set sharenfs=on data

This will export any "children" datasets (or filesystems) automatically like data/jails:

[kaeru@xavier ~]$ showmount -e
Exports list on localhost:
/data/videos/family                Everyone
/data/videos                       Everyone
/data/photos                       Everyone
/data/music                        Everyone
/data                              Everyone

You can set better security options of course. Back to filesystems vs directories: if you NFS mount /data on a remote PC, you won't see /data/music or /data/photos. This is because they are not part of the /data filesystem as directories; they are separate filesystems. If you want them available as /data/music on the client, you'll have to mount them again, maybe as a nullfs mount on the server or as additional mounts on the client. Hierarchy here applies to datasets, not subdirectories, which work as in a normal POSIX filesystem. This should not be an issue in future with NFSv4 namespace support.

You can use the old way of configuring /etc/exports if you want, but I like this way better; it makes sense.

Quotas

Similarly, there is no need to mess around with quotas in fstab anymore. One of the reasons for having jail directories on MD disks was to get a hard filesystem quota. With ZFS pools this is no longer an issue:

xavier# zfs set quota=100GB data/jails
xavier# zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
data                 97.1G   816G    27K  /data
data/jails           1.80G  98.2G    19K  /jails
data/jails/kaeru.my  1.80G  98.2G  1.80G  /jails/kaeru.my
data/music           55.6G   816G  55.6G  /data/music
data/photos          21.4G   816G  21.4G  /data/photos
data/videos          18.3G   816G    19K  /data/videos
data/videos/family   18.3G   816G  18.3G  /data/videos/family

The data/jails filesystem is now limited to 100GB, and now we want to limit the kaeru.my jail to 20GB:

xavier# zfs set quota=20GB data/jails/kaeru.my
xavier# zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
data                 98.8G   815G    27K  /data
data/jails           1.80G  98.2G    19K  /jails
data/jails/kaeru.my  1.80G  18.2G  1.80G  /jails/kaeru.my

The kaeru.my jail is now limited to 20GB, whereas before it inherited the jails limit of 100GB. Neat, huh? Oh, and it's no longer UFS2 or a file-backed MD disk: no more long bgfsck runs after unexpected reboots, and no more double overhead from a file-backed MD disk hurting performance.

There is a long list of other ZFS features, of which snapshots and the ability to send snapshots over pipes and ssh look the most interesting.
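I have not set this up yet, so treat the following as a sketch of how a snapshot could be taken and streamed to another machine. The function names, the host backuphost and the target dataset backup/photos are placeholders, not part of my setup.

```sh
# Build a dated snapshot name such as data/photos@2010-02-28.
snapshot_name() {
    printf '%s@%s\n' "$1" "$(date +%Y-%m-%d)"
}

# Snapshot a dataset and stream it to another machine over ssh.
backup_dataset() {
    dataset="$1"; host="$2"; target="$3"
    snap=$(snapshot_name "$dataset")
    zfs snapshot "$snap" &&
        zfs send "$snap" | ssh "$host" zfs receive "$target"
}

# Example: backup_dataset data/photos backuphost backup/photos
```

The receiving side needs a ZFS pool of its own and a user allowed to run zfs receive.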

Some tuning needed

ZFS by default tends to eat up a lot of memory, and this can result in poor performance. After reboot, r/w performance was reduced to around 5-10MB/s after several minutes of use. I had to reduce the ZFS adaptive replacement cache (ARC) usage, to 512MB on my 4GB server.

In /boot/loader.conf:

vfs.zfs.arc_max="512M"

After this change, performance was closer to the limit of the drives and stayed there.
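To confirm the limit took effect after a reboot, the values can be read back with sysctl, which reports them in bytes. The helper below is a sketch with a made-up name. The sysctl names in the comments are the ones I would expect on FreeBSD 8, so check them on your own system.

```sh
# Convert a byte count (as printed by sysctl) to whole mebibytes.
bytes_to_mib() {
    awk -v b="$1" 'BEGIN { printf "%d\n", b / 1048576 }'
}

# On the FreeBSD server, something like:
#   bytes_to_mib "$(sysctl -n vfs.zfs.arc_max)"               # configured cap
#   bytes_to_mib "$(sysctl -n kstat.zfs.misc.arcstats.size)"  # current ARC size
```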

FreeBSD 8.0 Errata

FreeBSD 8 has a ton of new features, which will take a long time to explore. The good thing is that the performance features are immediately available such as the new scheduler. Here are some of the errata:

  • Dummynet used for bandwidth shaping seems to have some bugs, but patches are available: http://www.mail-archive.com/freebsd-ipfw@freebsd.org/msg02261.html especially the "dummynet: OUCH! pipe should have been idle!" messages.
  • Wifi setup has changed a bit; you need to set up wlan pseudo devices now.
  • Jails have new functions and command options, including multiple IPs per jail, IPv6, jails within jails and network stack virtualization.