Kaeru's Online Journal
Online journal entries sorted by date
Network media status and settings
Most of us now work with relatively large amounts of data, whether media or other files. I've been on a Gigabit Ethernet switch for a few years now, because transferring data or virtual machine images of several gigabytes over the network is painfully slow at 100Mb/s (12.5MB/s max). If you see this limit when transferring files with GigE equipment and Cat5e/6 cables, chances are auto-negotiation has settled on a conservative speed.
One usually thinks of wired connections as relatively plug and play, and that's true for the most part. Unfortunately, I found out recently that, at least on my Ubuntu Linux workstation with cheap networking equipment such as the RealTeks, the Lantecs and whatnot that you have at home, the defaults may set your media speed to 100Mb/s (Fast Ethernet) and not 1000Mb/s (Gigabit Ethernet).
These days you do not need to look at blinking lights to see if stuff is connected (usually).
Checking and setting Ethernet media status on Linux
sudo ethtool eth0 (or your ethernet device):
Settings for eth0:
Supported ports: [ TP MII ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: MII
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pumbg
Wake-on: g
Current message level: 0x00000033 (51)
Link detected: yes
You'll notice that Speed here is at 1000Mb/s. Initially it was at 100Mb/s by default on mine.
Setting it is rather straightforward, with speed defined in Mb/s:
sudo ethtool -s eth0 speed 1000
The man page for ethtool is actually friendly with examples, something that often isn't the case in Linux.
You probably want to set this as default on startup, in something like rc.local.
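As a rough sketch of what such a startup snippet could look like: the following assumes the interface is eth0 and that ethtool is installed. The helper names link_speed and ensure_gigabit are made up for this example and are not part of ethtool.

```shell
#!/bin/sh
# Sketch for /etc/rc.local: force gigabit only if the link came up slower.
# Assumptions: interface eth0; link_speed and ensure_gigabit are
# hypothetical helper names invented for this example.

# Print the value of the "Speed:" line from ethtool-style output on stdin.
link_speed() {
    awk -F': *' '/^[ \t]*Speed:/ { print $2 }'
}

# Check the current speed of interface $1 and set 1000Mb/s if it differs.
ensure_gigabit() {
    iface="${1:-eth0}"
    command -v ethtool >/dev/null 2>&1 || return 0
    current="$(ethtool "$iface" 2>/dev/null | link_speed)"
    if [ -n "$current" ] && [ "$current" != "1000Mb/s" ]; then
        ethtool -s "$iface" speed 1000
    fi
}
```

Calling ensure_gigabit eth0 near the end of rc.local would then leave a link that already negotiated 1000Mb/s alone.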
Checking and setting Ethernet media status on FreeBSD
The ifconfig command on FreeBSD generally provides all this info for you:
re1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 7200
options=389b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
ether 00:13:f7:3a:80:f3
inet 192.168.1.1 netmask 0xffffff00 broadcast 192.168.1.255
inet 10.1.1.1 netmask 0xffffff00 broadcast 10.1.1.255
inet 10.1.1.2 netmask 0xffffffff broadcast 10.1.1.2
inet 10.1.1.3 netmask 0xffffffff broadcast 10.1.1.3
inet 10.1.1.4 netmask 0xffffffff broadcast 10.1.1.4
inet 10.1.1.5 netmask 0xffffffff broadcast 10.1.1.5
inet 10.1.1.6 netmask 0xffffffff broadcast 10.1.1.6
inet 10.1.1.7 netmask 0xffffffff broadcast 10.1.1.7
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
No problems here, 1000baseT as default.
A bit of tuning and jumbo frames
More on jumbo frames (Wikipedia) and their benefits.
With gigabit negotiated I get about 25MB/s by default, already an increase over Fast Ethernet, but you can tune things further. One tuning option is to enable jumbo frames. The default MTU is only 1500 bytes. Most of us at home are likely to be using some sort of RealTek card. Jumbo frames usually use an MTU of 9000, but RealTek cards top out lower: the maximum is 7200 on Linux and 7422 on FreeBSD. So I set both to 7200.
Setting the MTU can be done graphically or via ifconfig on both operating systems.
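A minimal sketch of the ifconfig route, with a way to check the result. The interface names eth0 and re1 follow the examples in this entry, "xavier" stands in for whatever host is at the other end, and ping_payload is a helper name invented here. The switch in between has to pass jumbo frames as well.

```shell
#!/bin/sh
# Sketch: set a 7200-byte MTU and work out the ping payload that tests it.
# Assumptions: eth0 / re1 are example interface names, "xavier" is an
# example host, ping_payload is a hypothetical helper.

MTU=7200

# Largest ICMP echo payload that fits in one frame:
# MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
ping_payload() {
    echo $(( $1 - 20 - 8 ))
}

# Run these as root on the respective machines:
#   Linux:    ifconfig eth0 mtu 7200
#   FreeBSD:  ifconfig re1 mtu 7200
#
# Then check that full-size frames pass without fragmentation:
#   Linux:    ping -M do -s "$(ping_payload "$MTU")" xavier
#   FreeBSD:  ping -D -s "$(ping_payload "$MTU")" xavier

echo "payload for MTU $MTU: $(ping_payload "$MTU") bytes"
```

If the ping with the don't-fragment flag fails while a normal ping works, something on the path is still at a smaller MTU.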
Now I'm getting around 40MB/s, which is about 3.2 times the 12.5MB/s maximum of the initial default setting of 100Mb/s on Linux.
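The arithmetic behind these figures can be checked with a couple of small helpers. The function names are made up for this example; the only inputs are the 100Mb/s and 1000Mb/s link rates and the 40MB/s measurement quoted in this entry.

```shell
#!/bin/sh
# Convert a link rate in Mb/s to its theoretical ceiling in MB/s (8 bits per byte).
line_rate_mbytes() {
    awk -v mbit="$1" 'BEGIN { printf "%.1f\n", mbit / 8 }'
}

# Ratio between two throughputs, e.g. measured speed vs. the old ceiling.
speedup() {
    awk -v new="$1" -v old="$2" 'BEGIN { printf "%.1f\n", new / old }'
}

line_rate_mbytes 100     # Fast Ethernet ceiling: 12.5
line_rate_mbytes 1000    # Gigabit Ethernet ceiling: 125.0
speedup 40 12.5          # 40MB/s measured vs. Fast Ethernet ceiling: 3.2
```

So 40MB/s is still well short of the 125MB/s gigabit ceiling, which suggests the disks or the protocol overhead are the limit rather than the wire.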
ZFS on FreeBSD and Benefits of Software RAID
This was an unplanned journal entry. I wasn't planning on an upgrade and update to my home server, which runs on FreeBSD. Bad things seem to happen all at once, and soon after I got a nasty throat infection, my home server motherboard died. While I was installing the replacement motherboard, one of the mirrored disks of the main file storage device failed. Time to make lemonade I guess.
A few lessons here:
- Always have RAID-1 or RAID-5/RAID-Z, even for workstations. In this case, no priceless family photos or videos were lost. For workstations, you don't lose any time from work, and can grab a replacement disk later.
- Software RAID is flexible for commodity hardware, which often has no 1-to-1 replacement at the shop a year or so after you bought it. You can usually just connect the old drives to a new motherboard, controller or another PC and it will just work. For desktop users, Fedora Linux lets you set it up via the GUI during installation. Hopefully Ubuntu will have it too, as I think it's a good thing if it's easy for home users.
- The RAID-1 of most motherboards works as it should, and you can disable the RAID setting and the drive(s) will still be easily accessible as a normal drive. As per the previous point, software RAID is recommended.
Time for ZFS
The two failures conspired to force this upgraded setup earlier than anticipated. FreeBSD 7.1 had problems booting from a PATA drive on the MSI KA70VM, forcing me to do a FreeBSD 8.0 binary upgrade from CD (totally trouble free, I might add). The current best bang-for-the-buck drives are 1TB, and that size is painful with UFS2. With ZFS production ready on 8.0, it's time for a modern storage layout.
ZFS Man (YouTube) is a funny and informative introduction to ZFS on FreeBSD.
These resources will get you going:
- http://wiki.freebsd.org/ZFS
- http://wiki.freebsd.org/ZFSQuickStartGuide
- http://en.wikipedia.org/wiki/ZFS
Some more tips here:
RAID-Z or Mirror?
Constantin Gonzalez has written an informative blog post on this.
Your options are RAID-Z, which gives more usable space per drive for less money but is less flexible, or a mirror, which gives less space but is more flexible and performs faster. With 6 SATA ports and the Antec P182 case having a 4 + 2 drive cage layout, a mirror makes more sense on commodity hardware, where avoiding data loss matters more than space.
Here is my list on why mirror makes more sense for commodity hardware:
- I don't need that much space. I don't have large media requirements for critical shared data. Non-critical data can also sit safely on my mirrored workstation drives.
- You need boot disks, which should be mirrored. Currently I'm using 2 x 80GB PATA drives, but this won't be feasible in the near future. So that leaves you with 4 SATA ports.
- Another SATA port is taken up by your DVDR drive.
- So you're left with 3 slots. With that few, RAID-Z doesn't make sense for me, especially given the option of a 3-way mirror and of swapping in larger drives to upgrade the mirror seamlessly. That makes sense on a household budget, where it's hard to justify buying 5 disks.
- More drives = more heat and power usage = more noise.
Since commodity drives are likely to fail anyway, I grabbed a pair of the cheapest 1TB drives available, which currently are the Samsung Spinpoint F1. Performance was surprisingly good for these drives.
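To make the 3-way mirror and drive-swapping idea from the list above concrete, here is a hedged sketch using zpool attach and zpool replace. It assumes a pool called data mirrored on ad0 and ad1 (the layout this entry ends up with); ad4 is a made-up name for a newly added disk, and the helper names are invented for the example. Commands are only printed unless ZFS_RUN=1 is set.

```shell
#!/bin/sh
# Sketch: growing or upgrading a ZFS mirror one disk at a time.
# Assumptions: pool "data" mirrored on ad0/ad1; ad4 is a hypothetical new disk.
# Nothing is executed unless ZFS_RUN=1; by default commands are just printed.

run() {
    if [ "${ZFS_RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Add another side to the mirror: attach new disk $2 next to existing disk $1.
add_mirror_side() {
    run zpool attach data "$1" "$2"
}

# Swap existing disk $1 for new (possibly larger) disk $2 and let it resilver.
swap_mirror_side() {
    run zpool replace data "$1" "$2"
}

# Either of these, depending on whether a third copy or a swap is wanted:
add_mirror_side ad1 ad4
swap_mirror_side ad0 ad4
```

In both cases zpool status shows the resilver progress, and the mirror stays online while it runs.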
Setting it up
This part blew me away.. ZFS rocks.
I find out that my two new drives are ad0 and ad1, with atacontrol list:
ATA channel 0:
Master: ad0 <SAMSUNG HD103UJ/1AA01118> SATA revision 2.x
Slave: ad1 <SAMSUNG HD103UJ/1AA01118> SATA revision 2.x
ATA channel 1:
Master: ad2 <ST380023A/3.33> ATA/ATAPI revision 6
Slave: ad3 <Maxtor 6L250R0/BAH41G10> ATA/ATAPI revision 7
ATA channel 2:
Master: no device present
Slave: no device present
ATA channel 3:
Master: no device present
Slave: no device present
ATA channel 4:
Master: no device present
Slave: no device present
ATA channel 5:
Master: acd0 <PIONEER DVD-RW DVR-212/1.21> SATA revision 1.x
Slave: no device present
So let's create our mirror pool:
zpool create data mirror ad0 ad1
That's it, data is the pool name I used and it's automatically mounted at /data (no need to mess around with fstab and such).
Let's find out our new pool status:
[kaeru@xavier ~]$ zpool status
pool: data
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
data ONLINE 0 0 0
mirror ONLINE 0 0 0
ad0 ONLINE 0 0 0
ad1 ONLINE 0 0 0
errors: No known data errors
And where it's mounted and how much space is available:
[kaeru@xavier ~]$ zfs list
NAME  USED  AVAIL  REFER  MOUNTPOINT
data  105G  808G   27K    /data
...
I've snipped some data here on some other mountpoints, hence some space is used already. This is immediately usable like any other filesystem.
Some clarification is needed here. The pool acts both as a storage device and as a filesystem, so by default data is the name of both the pool and its top-level filesystem.
You can already copy files to this /data filesystem; however, everything in it will be treated as if it's a single partition, so you can't do fancy stuff like setting quotas, additional copies, compression and so on for subdirectories.
In order to do that, you need to create additional filesystems using the data pool:
zfs create data/jails
zfs set mountpoint=/jails data/jails
This is going to create a jails filesystem in the data pool, and automatically mount it as /jails. The mount command will show how this works:
mount
...
data/jails on /jails (zfs, local)
data on /data (zfs, NFS exported, local)
...
ls /data/jails is going to say no such file or directory, because there is no directory there. You could mkdir /data/jails if you wish, but that would be a plain directory, not the filesystem.
By default, without the mountpoint option, data/jails would have been automatically mounted as /data/jails. In the above example the difference between a filesystem and normal directory is clear. This difference is important when you export filesystems and wonder why /data is empty.
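As a sketch of the per-filesystem "fancy stuff" mentioned above, the following sets compression and extra copies on individual datasets and then lists where each dataset is mounted. The dataset names come from this entry; the choice of which properties go on which dataset is only an illustration, and the helper names are invented. Commands are printed rather than executed unless ZFS_RUN=1 is set.

```shell
#!/bin/sh
# Sketch: per-dataset properties, which plain subdirectories cannot have.
# Assumptions: datasets data/jails and data/photos exist; the property
# choices are illustrative only. Nothing runs unless ZFS_RUN=1.

run() {
    if [ "${ZFS_RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

tune_datasets() {
    # Compress everything written to the jails filesystem from now on.
    run zfs set compression=on data/jails
    # Keep two copies of every block for the photos filesystem.
    run zfs set copies=2 data/photos
    # Show the resulting settings for one dataset.
    run zfs get compression,copies,mountpoint data/jails
    # List every dataset in the pool with its mountpoint, to tell real
    # filesystems apart from ordinary directories.
    run zfs list -o name,mountpoint -r data
}

tune_datasets
```

Anything that shows up in the zfs list output is a filesystem in its own right; anything that only shows up in ls is just a directory.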
Automatic exporting of NFS/SMB shares
Exporting filesystems can now be done automatically using zfs commands:
zfs set sharenfs=on data
This will export any "children" datasets (or filesystems) automatically like data/jails:
[kaeru@xavier ~]$ showmount -e
Exports list on localhost:
/data/videos/family Everyone
/data/videos Everyone
/data/photos Everyone
/data/music Everyone
/data Everyone
You can set better security options of course. Back to filesystems vs directories: if you NFS mount /data on a remote PC, you won't see /data/music or /data/photos. This is because they are separate filesystems, not directories inside the /data filesystem. If you want them available as /data/music on the client you'll have to mount them again, maybe as a nullfs mount on the server or as additional mounts on the client. Hierarchy here applies to datasets, not subdirectories, which work as in a normal POSIX filesystem. This should not be an issue in future with NFSv4 namespace support.
You can use the old way of configuring /etc/exports if you want, but I like this way better; it makes sense.
Quotas
Similarly, there is no need to mess around with quotas in fstab anymore. One of the reasons for having jail dirs on MD disks is a hard filesystem quota. With ZFS pools this is no longer an issue:
xavier# zfs set quota=100GB data/jails
xavier# zfs list
NAME                 USED   AVAIL  REFER  MOUNTPOINT
data                 97.1G  816G   27K    /data
data/jails           1.80G  98.2G  19K    /jails
data/jails/kaeru.my  1.80G  98.2G  1.80G  /jails/kaeru.my
data/music           55.6G  816G   55.6G  /data/music
data/photos          21.4G  816G   21.4G  /data/photos
data/videos          18.3G  816G   19K    /data/videos
data/videos/family   18.3G  816G   18.3G  /data/videos/family
data/jails filesystem is now limited to 100GB, and now we want to limit kaeru.my jail to 20GB:
xavier# zfs set quota=20GB data/jails/kaeru.my
xavier# zfs list
NAME                 USED   AVAIL  REFER  MOUNTPOINT
data                 98.8G  815G   27K    /data
data/jails           1.80G  98.2G  19K    /jails
data/jails/kaeru.my  1.80G  18.2G  1.80G  /jails/kaeru.my
kaeru.my jail is now limited to 20GB, whereas before it inherited the jails limit of 100GB. Neat huh? Oh, and it's no longer UFS2 or a file-backed MD disk.. no more long bgfsck's on unexpected reboots, no more double overhead of a file-backed MD disk hurting performance.
There is a long list of other ZFS features, of which snapshots and the ability to send snapshots over pipes and ssh look the most interesting.
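For a taste of the snapshot feature, here is a minimal sketch of taking a snapshot and piping it to another machine over ssh with zfs send and zfs receive. I have not set this up in this entry: the host name backuphost, the target dataset backup/photos and the helper name backup_dataset are all assumptions for illustration. The commands are printed unless ZFS_RUN=1 is set.

```shell
#!/bin/sh
# Sketch: snapshot a dataset and stream it to another host over ssh.
# Assumptions: "backuphost" and "backup/photos" are hypothetical; the remote
# side needs ZFS and a pool named "backup". Nothing runs unless ZFS_RUN=1.

backup_dataset() {
    dataset="$1"   # e.g. data/photos
    label="$2"     # snapshot label, e.g. a date
    host="$3"      # remote machine reachable via ssh
    target="$4"    # dataset to create on the remote pool
    snap="${dataset}@${label}"

    if [ "${ZFS_RUN:-0}" = 1 ]; then
        zfs snapshot "$snap" &&
            zfs send "$snap" | ssh "$host" zfs receive "$target"
    else
        echo "zfs snapshot $snap"
        echo "zfs send $snap | ssh $host zfs receive $target"
    fi
}

backup_dataset data/photos 2009-12-01 backuphost backup/photos
```

The snapshot itself is instant and takes no extra space until the live data diverges from it.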
Some tuning needed
ZFS by default tends to eat up a lot of memory, and this can result in poor performance. After reboot, r/w performance was reduced to around 5-10MB/s after several minutes of use. I had to reduce the ZFS adaptive replacement cache (ARC) usage, to 512MB on my 4GB server.
In /boot/loader.conf:
vfs.zfs.arc_max="512M"
After this change, performance was closer to the limit of the drives and stayed there.
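To confirm the limit took effect after a reboot, the loader.conf value can be compared with what the kernel reports. This is a sketch: to_bytes is a helper invented for this example, and the two sysctl names are the ones FreeBSD exposes for the ARC limit and current ARC size, queried only when the script runs on FreeBSD.

```shell
#!/bin/sh
# Sketch: compare the configured ARC limit with what the kernel reports.
# Assumptions: to_bytes is a hypothetical helper; "512M" is the value set
# in /boot/loader.conf above.

# Convert a loader.conf-style size such as "512M" to bytes.
to_bytes() {
    case "$1" in
        *K) echo $(( ${1%K} * 1024 )) ;;
        *M) echo $(( ${1%M} * 1024 * 1024 )) ;;
        *G) echo $(( ${1%G} * 1024 * 1024 * 1024 )) ;;
        *)  echo "$1" ;;
    esac
}

echo "expected arc_max: $(to_bytes 512M) bytes"

# Only FreeBSD has these sysctls; skip quietly elsewhere.
if [ "$(uname -s)" = "FreeBSD" ]; then
    echo "kernel arc_max:   $(sysctl -n vfs.zfs.arc_max) bytes"
    echo "ARC size now:     $(sysctl -n kstat.zfs.misc.arcstats.size) bytes"
fi
```

If the kernel value does not match, the loader.conf line was probably not picked up, since the tunable is only read at boot.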
FreeBSD 8.0 Errata
FreeBSD 8 has a ton of new features, which will take a long time to explore. The good thing is that the performance features, such as the new scheduler, are immediately available. Here are some of the errata:
- Dummynet, used for bandwidth shaping, seems to have some bugs, especially the "dummynet: OUCH! pipe should have been idle!" messages, but patches are available: http://www.mail-archive.com/freebsd-ipfw@freebsd.org/msg02261.html
- Wifi setup has changed a bit; you need to set up wlan pseudo devices now.
- Jails have new functions and command options, including multiple IPs per jail, IPv6, jails within jails and network stack virtualization.
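For the wlan pseudo device change in the list above, a minimal /etc/rc.conf sketch might look like the following. It assumes an Atheros card that shows up as ath0 and a WPA network with DHCP; both are assumptions for illustration, so substitute your own driver name and settings.

```shell
# /etc/rc.conf fragment (sketch).
# Assumptions: the wireless card is ath0; WPA and DHCP are wanted.

# Create the wlan0 pseudo device on top of the physical ath0 device at boot.
wlans_ath0="wlan0"

# Configure the pseudo device, not the physical one.
ifconfig_wlan0="WPA DHCP"

# To create the pseudo device by hand instead:
#   ifconfig wlan0 create wlandev ath0
```

With WPA in the ifconfig line, the network name and key are read from /etc/wpa_supplicant.conf.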

