Archive for the ‘Solaris’ Category

Setting up login on serial port on Solaris

Thursday, November 8th, 2007

1. Ensure that there are no services already defined, using: pmadm -l (to remove one, use pmadm -r -p <name1> -s <name2>, where name1 and name2 are taken from the pmadm -l output).

2. Place serial port parameters in /etc/ttydefs:
echo "conttymy:9600 hupcl:9600 hupcl::" >> /etc/ttydefs

3. Execute:
pmadm -a -p zsmon -s ttya -i root -fu -v `ttyadm -V` -m "`ttyadm -m ldterm,ttcompat -p 'Enter_your_Hostname login: ' -S n -T vt100 -d /dev/term/a -l conttymy -s /usr/bin/login`"

4. Verify by using: pmadm -l

5. Try connecting from the other end(KVM/other server).

6. To allow root login over the serial line, edit the /etc/default/login file and comment out CONSOLE=/dev/console.
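For step 5, one simple option is cu(1) from the remote box (a sketch; the serial device path below is hypothetical and depends on that machine's OS and cabling):

# Attach to the remote machine's own serial port at the same speed:
cu -s 9600 -l /dev/ttyS0
# type "~." to disconnect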

rpc.metacld or rpc.metad: Permission denied on Solaris 10u4

Thursday, September 13th, 2007

If for some reason you get the “rpc.metad: <hostname>: Permission denied” error while creating a multi-owner metaset (e.g. for Oracle RAC; metaset -s oradg -M …), edit /etc/group and add root to the sysadmin group, then retry the command.
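For example, the relevant /etc/group line might end up looking like this (GID 14 is the usual Solaris default for sysadmin; yours may differ):

sysadmin::14:root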

This error is strange, because I didn't get it while creating metasets on S10U3. Hm.

My engineering work…

Wednesday, August 15th, 2007

Since May I've been very busy architecting & implementing a Java Enterprise Edition cluster on commodity hardware (mainly x86_32 based) for my engineering thesis – to obtain a BEng degree. Our subject is:
“Web service based on scalable and highly available J2EE application cluster”. Our team consists of 4 people, and I'm responsible for all kinds of systems/hardware scaling/clusters/load balancing/databases/networking/tuning everything :) . What kind of portal we will create is still to be decided by the developers (it will likely be some kind of Web 2.0 portal).
The rest of the team is dedicated to J2EE programming. We are mainly playing with the technology.
The current rock-solid base core cluster architecture looks like this:

Cluster architecture

We are utilizing:

  • Load balancers: Linux Virtual Server with Direct Routing on CentOS 5 (configured as part of Red Hat Cluster Suite)
  • Database: Oracle 10g R2
  • Middleware: JBOSS 4.2.0 (EJB3) running in a cluster based on JGroups + Hibernate (JPA) + JBOSS Cache
  • Frontend: Apache2 web servers with Solaris Network Cache Accelerator and an AJP proxy to the JBOSS servers
  • Solaris Jumpstart to set up new systems really fast, with our self-written PHP application for maintaining systems.
  • NFS for serving static content to the web servers from the Oracle server (yay! a dedicated NetApp would be great! ;) )
  • LDAP to synchronize admin accounts across the cluster.
  • SNMPv2 (LVS, OSes, JBOSS, Oracle) to monitor everything with a single (self-written) Java application which graphs everything in real time.

As this is a basic configuration with the database as a single point of failure, in September I'm going to set up Data Guard for Oracle. I'm also testing more advanced scale-up. Currently I'm in the process of setting up Solaris Cluster with Oracle RAC 10gR2 on iSCSI storage provided by a third node running Solaris Nevada with an iSCSI target, in order to test Transparent Application Failover. I've been scratching my head over this one for a while now. Yeah, it is real hardcore… and that's not the end of the story – Disaster Recovery with some other interesting bits of technology is going to be implemented later on… all on x86_32 commodity hardware :) We are also going to put C-JDBC (Sequoia project) under stress…

Solaris x86 customized Jumpstart from Linux NFS server — NFSv4 problem and solution

Friday, July 6th, 2007

There is some kind of incompatibility between the Linux 2.6 NFSv4 server and the Solaris 10 (U3) NFSv4 client. On an installed Solaris you can put some variables into /etc/default/nfs and it should work, but when you are trying to bootstrap from a Linux NFS server using Jumpstart you have to look for another solution:

1) Build a new miniroot image with /etc/default/nfs altered?
2) Simpler… alter the Linux NFS server to provide e.g. only the NFSv2 service.
This can be achieved by recompiling the kernel without NFSv4 or, much cleaner, by disabling the NFSv4 services at runtime.

Place the following in /etc/sysconfig/nfs (RHEL5/CentOS5 specific configuration file):
RPCMOUNTDOPTS="--no-tcp --no-nfs-version 4 --no-nfs-version 3"
RPCNFSDARGS="--no-tcp --no-nfs-version 4 --no-nfs-version 3"

Now execute
/etc/init.d/nfs restart
That’s all! :) Jumpstart problem solved!

For more info consider reading the man pages for rpc.nfsd and rpc.mountd. Internally those switches write "+2 -3 -4" to /proc/fs/nfsd/versions. The versions file can only be modified after stopping the [nfsd] kernel service (you'll get EBUSY while trying to change it with nfsd launched).
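To check what the running server currently advertises:

# Show which NFS versions nfsd offers; with the options above this should read "+2 -3 -4"
# (the file is writable only while nfsd is stopped):
cat /proc/fs/nfsd/versions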

Performance comparison: Apache2 on Nevada65 with and without (kernel-based) Network Cache Accelerator

Wednesday, June 27th, 2007

Apache 2.2.3 is running on Solaris Nevada build 65 on x86 (VMware emulated), has 0.5 GB of RAM, and ~660 MB of data in the Apache document root.

The web data is divided into 10000 1-kbyte files and ~650 1-Mbyte files. Logging during the test was disabled for NCA and also for Apache (the CustomLog directive commented out).

Load is generated by http_load (downloaded from acme.com), running on the machine hosting the VM – Linux based; the url file is built by this script:

#!/bin/bash
# Build the "url" file: 10000 URLs for the 1 kB files (k1..k9999),
# plus a random 1 MB file (m1..m649) after every 10th small file.
c=1
yes=0
rm -f url
while [ $c -lt 10000 ]; do
    echo "http://192.168.77.22/k$c" >> url
    yes=$(( (yes + 1) % 10 ))
    if [ $yes -eq 0 ]; then
        x=$(( RANDOM % 649 + 1 ))
        echo "http://192.168.77.22/m${x}" >> url
    fi
    c=$(( c + 1 ))
done

Triple run of http_load on Apache2 without NCA:


vnull@xeno:~/inz/instalki/http_load-12mar2006$ !./http
./http_load -parallel 5 -seconds 10 url
947 fetches, 5 max parallel, 9.52494e+07 bytes, in 10 seconds
100580 mean bytes/connection
94.7 fetches/sec, 9.52494e+06 bytes/sec
msecs/connect: 0.811796 mean, 22 max, 0.248 min
msecs/first-response: 2.89326 mean, 142.79 max, 0.562 min
HTTP response codes:
code 200 — 947

vnull@xeno:~/inz/instalki/http_load-12mar2006$ ./http_load -parallel 5 -seconds 10 url
1039 fetches, 5 max parallel, 9.84863e+07 bytes, in 10 seconds
94789.5 mean bytes/connection
103.9 fetches/sec, 9.84862e+06 bytes/sec
msecs/connect: 0.706046 mean, 23.07 max, 0.247 min
msecs/first-response: 3.09328 mean, 258.584 max, 0.564 min
HTTP response codes:
code 200 — 1039

vnull@xeno:~/inz/instalki/http_load-12mar2006$ ./http_load -parallel 5 -seconds 10 url
959 fetches, 5 max parallel, 1.01547e+08 bytes, in 10 seconds
105888 mean bytes/connection
95.9 fetches/sec, 1.01547e+07 bytes/sec
msecs/connect: 0.824296 mean, 42.473 max, 0.248 min
msecs/first-response: 2.81876 mean, 231.154 max, 0.528 min
HTTP response codes:
code 200 — 959

Triple run of http_load against Apache2 WITH NCA:

vnull@xeno:~/inz/instalki/http_load-12mar2006$ ./http_load -parallel 5 -seconds 10 url
1353 fetches, 5 max parallel, 1.41757e+08 bytes, in 10.0018 seconds
104773 mean bytes/connection
135.276 fetches/sec, 1.41732e+07 bytes/sec
msecs/connect: 4.6854 mean, 77.009 max, 0.399 min
msecs/first-response: 8.05471 mean, 77.009 max, 1.403 min
HTTP response codes:
code 200 — 1353

vnull@xeno:~/inz/instalki/http_load-12mar2006$ ./http_load -parallel 5 -seconds 10 url
1494 fetches, 5 max parallel, 1.48097e+08 bytes, in 10.0006 seconds
99127.9 mean bytes/connection
149.391 fetches/sec, 1.48088e+07 bytes/sec
msecs/connect: 4.26223 mean, 57.807 max, 0.398 min
msecs/first-response: 7.23063 mean, 57.807 max, 0.583 min
HTTP response codes:
code 200 — 1494

vnull@xeno:~/inz/instalki/http_load-12mar2006$ ./http_load -parallel 5 -seconds 10 url
1568 fetches, 5 max parallel, 1.51315e+08 bytes, in 10.0029 seconds
96502.2 mean bytes/connection
156.755 fetches/sec, 1.51272e+07 bytes/sec
msecs/connect: 3.97283 mean, 207.755 max, 0.398 min
msecs/first-response: 6.71551 mean, 207.755 max, 0.586 min
HTTP response codes:
code 200 — 1568

NCA: (157+149+135)/3 = ~147 r/s
without NCA: (95+104+96)/3 = ~98 r/s

NCA gave a ~50% boost for free in this test scenario!

Note that /dev/nca's nca_max_cache_size was set to 2048 (using ndd) so that only files up to 2 kB are cached.
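For reference, the ndd invocation for that tunable looks roughly like this (a sketch):

# Cap NCA's in-kernel cache to objects of at most 2 kB:
ndd -set /dev/nca nca_max_cache_size 2048
# read the current value back:
ndd /dev/nca nca_max_cache_size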

Solaris Network Cache & Accelerator – does it work or not .. ?

Wednesday, June 27th, 2007

Determining whether Network Cache Accelerator works on Solaris…

First, ensure that you have enabled everything you need in /etc/nca/*.
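For reference, a minimal NCA configuration might look something like the sketch below (the interface name is hypothetical; see the NCA man pages for the exact format of each file):

# /etc/nca/ncakmod.conf – enable the kernel module
status=enabled

# /etc/nca/nca.if – interfaces NCA should attach to (hypothetical NIC name)
e1000g0

# /etc/nca/ncaport.conf – IP/port pairs handled by NCA
ncaport=192.168.77.22/80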

Check that Apache2 is NOT listening on IPv6!
A simple test:
-bash-3.00# grep ^Listen /etc/apache2/httpd.conf
Listen 192.168.77.22:80

That should be enough.

Next place the following into /usr/apache2/bin/apachectl just before the end of the configuration section:

# Enable NCA:
NCAKMODCONF=/etc/nca/ncakmod.conf
if [ -f $NCAKMODCONF ]; then
    . $NCAKMODCONF
    if [ "x$status" = "xenabled" ]; then
        HTTPD="env LD_PRELOAD=/usr/lib/ncad_addr.so $HTTPD"
    fi
fi

Reboot – yay, this shouldn’t happen on a UNIX box ;)

ncad_addr.so should be preloaded by LD_PRELOAD:
-bash-3.00# pldd `pgrep http`|grep ncad
/usr/lib/ncad_addr.so.1
/usr/lib/ncad_addr.so.1
/usr/lib/ncad_addr.so.1
/usr/lib/ncad_addr.so.1
/usr/lib/ncad_addr.so.1
/usr/lib/ncad_addr.so.1

You can also check with pargs -e whether LD_PRELOAD is set properly.
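For example (a sketch of the check):

pargs -e `pgrep http` | grep LD_PRELOAD
# every httpd process should list LD_PRELOAD=/usr/lib/ncad_addr.so in its environment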

Hardcore way of determining whether NCA works:
-bash-3.00# truss -ff -t accept,listen,bind /usr/apache2/bin/apachectl start
703: bind(256, 0x08047C90, 16, SOV_SOCKBSD) = 0
703: listen(256, 8192, SOV_DEFAULT) = 0
707: accept(256, 0x081A5268, 0x081A5254, SOV_DEFAULT) (sleeping...)
709: accept(256, 0x081A5268, 0x081A5254, SOV_DEFAULT) (sleeping...)
711: accept(256, 0x081A5268, 0x081A5254, SOV_DEFAULT) (sleeping...)
713: accept(256, 0x081A5268, 0x081A5254, SOV_DEFAULT) (sleeping...)
715: accept(256, 0x081A5268, 0x081A5254, SOV_DEFAULT) (sleeping...)
<run eg. GET http://<ip>/123.html>
713: accept(256, 0x081A5268, 0x081A5254, SOV_DEFAULT) = 11
713: accept(256, 0x081A5268, 0x081A5254, SOV_DEFAULT) (sleeping...)
<run second time, no output should happen from accept() as request is served by kernel!>
^C
-bash-3.00#

Notes:
pfiles `pgrep http`|grep AF will show you that the listening socket is of type AF_INET, not AF_NCA! This is odd!

It seems that on Solaris Nevada truss -v is broken (it doesn't display parameters in detail?). On Solaris 10 it works.

Also, it seems that the output of sotruss -f -T ncad_addr.so differs between Solaris 10 and Nevada – smells like a second bug? It seems it doesn't show the calls to bind() from ncad_addr.so?

Securing OpenLDAP – userPassword issue

Tuesday, June 26th, 2007

Unsecured OpenLDAP (slapd) server…

Output from Solaris 10 box:

-bash-3.00# ldaplist -l passwd test5
dn: uid=test5,ou=People,dc=lab1
uid: test5
cn: Johnny Doe
[..]
homeDirectory: /export/home/test5
userPassword: {MD5}DMF1ucDxtqgxw5niaXcmYQ==

After adding the following snippet to OpenLDAP's slapd.conf we prevent anyone from viewing the user's password (including the Solaris LDAP proxy bind, but excluding the user logging in and the admin/Manager of slapd):

access to attrs=userPassword,shadowLastChange
        by dn="cn=admin,dc=lab1" write
        by anonymous auth
        by self write
        by * read


-bash-3.00# ldaplist -l passwd test5
dn: uid=test5,ou=People,dc=lab1
uid: test5
cn: Johnny Doe
[..]
gecos: Johnny Doe,none,0,1,Johnny Doe
homeDirectory: /export/home/test5
-bash-3.00#
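You can also verify from the OpenLDAP side with a plain anonymous search (a sketch; replace the host with your slapd server):

# Anonymous bind – userPassword should no longer be returned:
ldapsearch -x -h <ldap-server> -b dc=lab1 '(uid=test5)' userPassword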

Playing with long distance NFS replication/Disaster Recovery protection using Sun StorageTek Availability Suite

Thursday, June 7th, 2007

“Sun StorageTek Availability Suite, or AVS for short, is an OpenSolaris Community project that provides two filter drivers; Remote Mirror Copy & Point in Time Copy, a filter-driver framework, and an extensive collection of supporting software and utilities.

The Remote Mirror Copy and Point in Time Copy software allows volumes and/or their snapshots, to be replicated between physically separated servers in real time, or by point-in-time, over virtually unlimited distances. Replicated volumes can be used for tape and disk backup, off-host data processing, disaster recovery solutions, content distribution, and numerous other volume based processing tasks.”

Today I configured the following scenario:

DR for NFS servers

Both the avs1 and avs2 nodes are running OpenSolaris Nevada build 65. It works great! Each of the nodes runs a ZFS mirror on two disks. The AVS bitmaps should also be RAID protected (for example using SVM). After the DR switch:

AVS was a commercial product, but Sun decided to release it for free as an open source project, so enjoy!

Everyone likes screenshots, so here's one from the initial sync:

sun_avs_replication.png
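For the record, the Remote Mirror side of the setup boils down to sndradm(1M); enabling a primary/secondary volume pair looks roughly like the sketch below (the data and bitmap device paths are purely hypothetical):

# On the nodes, enable the replication set (avs1 = primary, avs2 = secondary):
sndradm -e avs1 /dev/rdsk/c1t1d0s0 /dev/rdsk/c1t1d0s1 \
        avs2 /dev/rdsk/c1t1d0s0 /dev/rdsk/c1t1d0s1 ip async
# then kick off the initial full synchronization from the primary:
sndradm -m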

links for 05/06/07

Tuesday, June 5th, 2007

J2EE performance tips

Some J2EE Performance Tips

Sun StorageTek Availability Suite 4.0 Software Installation and Configuration Guide

Sun StorageTek Availability Suite 4.0 Remote Mirror Software Administration Guide

Sun Cluster and Sun StorageTek Availability Suite 4.0 Software Integration Guide

… evolution doesn’t take prisoners …

Solaris 10 as a LDAP client of OpenLDAP (slapd)

Sunday, May 27th, 2007

It took me almost three hours to learn the basics of LDAP and understand why the native Solaris LDAP client doesn't work with the OpenLDAP slapd service…

Good links to start with:
Solaris LDAP client with OpenLDAP server

Solaris 8 OpenLDAP: Configuring

Some screenshots:
GQ LDAP schema dc=lab1

GQ schema view on “Kowalski” username

GQ schema view of the “Solaris” profile used by ldapclient(1M) to configure LDAP on the Solaris OS

Output of ldapclient on the Solaris box after configuration
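For reference, pointing the Solaris 10 box at the LDAP server using that “Solaris” profile boils down to something like this (a sketch; the proxy DN, password and server address are hypothetical):

# Initialize the native LDAP client from the profile stored on the server:
ldapclient init -a profileName=Solaris \
    -a domainName=lab1 \
    -a proxyDN="cn=proxyagent,ou=profile,dc=lab1" \
    -a proxyPassword=secret \
    192.168.77.10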

(Sun) Solaris Cluster 3.2 on x86-32bit … on VMware – screenshot

Monday, May 21st, 2007

booting_vulcan1_32bit.png

The BALROG Solaris LKM

Saturday, April 7th, 2007

Some time ago I wrote a proof-of-concept Solaris loadable kernel module to demonstrate sending packets from kernel space. You can see a proof-of-concept MPEG movie here. Similar modules have been floating around on the net for Linux for years, but there wasn't any for Solaris. The plan was to write a backdooring LKM with networking abilities, possibly with some advanced hiding features like controlling Balrog from a DNS server – Balrog was supposed to simulate a DNS client making requests to the proxy DNS servers from /etc/resolv.conf (the idea was to fool firewall/IDS/IPS systems which allow DNS traffic from servers). Due to lack of time I had to abort the project – only the bits of code responsible for sending and receiving were written, without even an in-kernel DNS library. In the movie you can see data being sent on UDP port 53 after module initialisation. It was real hackery to get things done, simply because the original Solaris 10 kernel didn't have an API for accessing the kernel side of sockets (fortunately the source code from OpenSolaris helped me a lot ;] )… The resulting C code of Balrog is so ugly that I'm not even going to release it; however, today I noticed a new OpenSolaris project named kernel-sockets, so maybe it's time for a small rewrite? :)

Exporting simple file from Linux target to Solaris initiatior using iSCSI

Friday, March 23rd, 2007

A quick HowTo on "exporting" a simple file via iSCSI from a Linux target (ietd) to Solaris:

The Linux target is running Debian 4.0, a 2.6.18 kernel and iSCSI Enterprise Target (ietd) version 0.4.14. I wish it were a Solaris box, but my very old home SCSI controllers (DELL MegaRAID 428 – PERC2 – and InitIO) aren't supported by Solaris. There are some drivers, but only for Solaris 2.7-2.8, and after a small war with them I must admit that I failed… even after playing with hardcore stuff in /etc/driver_aliases.

Installing iSCSI target on Debian is discussed here: Unofficial ISCSI target installation. Some checks:

rac3:/etc/init.d# cat /proc/net/iet/volume
tid:2 name:iqn.2001-04.com.example:storage.disk2.sys1.xyz
lun:0 state:0 iotype:fileio iomode:wt path:/u01/iscsi.target

rac3:/etc/init.d# cat /proc/net/iet/session
tid:2 name:iqn.2001-04.com.example:storage.disk2.sys1.xyz
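For reference, the target definition behind that output lives in ietd's configuration file (a sketch reconstructed from the volume listing above; typically /etc/ietd.conf on Debian):

Target iqn.2001-04.com.example:storage.disk2.sys1.xyz
        # export the plain file as LUN 0 using file I/O
        Lun 0 Path=/u01/iscsi.target,Type=fileio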

As you can see, /u01/iscsi.target is a normal file (created with dd(1)) on the MegaRAID RAID0 array. We will use it to do some testing from Solaris:


root@opensol:~# iscsiadm add static-config iqn.2001-04.com.example:storage.disk2.sys1.xyz,10.99.1.25
root@opensol:~# iscsiadm modify discovery --static enable
root@opensol:~# devfsadm -i iscsi
root@opensol:~# iscsiadm list target
Target: iqn.2001-04.com.example:storage.disk2.sys1.xyz
Alias: -
TPGT: 1
ISID: 4000002a0000
Connections: 1
root@opensol:~# format
Searching for disks...done

0. c1t0d0
/pci@0,0/pci1000,30@10/sd@0,0

1. c2t17d0
/iscsi/disk@0000iqn.200104.com.example%3Astorage.disk2.sys1.xyzFFFF,0
Specify disk (enter its number): CTRL+C


Okay, so we are now sure that iSCSI works. In a few days I'm going to test exporting a SONY SDT-9000 (an old tape drive) via iSCSI :)

IPMP tests / Solaris 10

Sunday, March 18th, 2007

IPMP – IP MultiPathing – is an HA (High Availability) technique that lets a server running Solaris stay reachable despite the failure of one or more of its network links. So much for the introduction :)

xeno – a machine connected to the switch with a single network card
10.99.1.20 – Solaris 10 with IPMP (2 network cards connected to the same switch as xeno)
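For context, a minimal Solaris 10 IPMP group can be defined with /etc/hostname.<interface> files along these lines (the interface names and the choice of link-based failure detection are assumptions, not necessarily the exact setup used here):

# contents of /etc/hostname.e1000g0 (data address, IPMP group "ipmp0"):
10.99.1.20 netmask + broadcast + group ipmp0 up

# contents of /etc/hostname.e1000g1 (second NIC, same group, no data address):
group ipmp0 up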

xeno:~# ping -i 0.5 10.99.1.20
PING 10.99.1.20 (10.99.1.20) 56(84) bytes of data.
64 bytes from 10.99.1.20: icmp_seq=1 ttl=255 time=0.318 ms
64 bytes from 10.99.1.20: icmp_seq=2 ttl=255 time=0.308 ms
64 bytes from 10.99.1.20: icmp_seq=3 ttl=255 time=0.293 ms
64 bytes from 10.99.1.20: icmp_seq=4 ttl=255 time=0.325 ms
64 bytes from 10.99.1.20: icmp_seq=5 ttl=255 time=0.312 ms
64 bytes from 10.99.1.20: icmp_seq=6 ttl=255 time=0.308 ms
64 bytes from 10.99.1.20: icmp_seq=7 ttl=255 time=0.342 ms
64 bytes from 10.99.1.20: icmp_seq=8 ttl=255 time=0.325 ms
# the cable gets pulled out...
64 bytes from 10.99.1.20: icmp_seq=19 ttl=255 time=0.267 ms
64 bytes from 10.99.1.20: icmp_seq=20 ttl=255 time=0.271 ms
64 bytes from 10.99.1.20: icmp_seq=21 ttl=255 time=0.338 ms
64 bytes from 10.99.1.20: icmp_seq=22 ttl=255 time=0.280 ms
64 bytes from 10.99.1.20: icmp_seq=23 ttl=255 time=0.254 ms

So 19-8 = 11 packets were lost, which translates into (19-8)*0.5 = 5.5 seconds of machine unavailability with the following entry in /etc/default/mpathd:
# 2s
FAILURE_DETECTION_TIME=2000

And now a few undocumented flags for in.mpathd:

  • -a "adopt" mode
  • -f force multicast
  • -d debug/foreground
  • -D debug flags
  • -l turn off link state notification handling (debug)

Quagga vtysh

Sunday, March 18th, 2007

Post from: 01/03/2007

In case someone didn't know: starting with version 0.99.5 you can run many Quagga commands in a single shell command – which is quite nice:

ZZZ:/etc# VTYSH_PAGER=/bin/cat vtysh -c 'show ip rip status' -c 'conf t' -c 'router rip' -c 'passive-interface eth0'
Routing Protocol is "rip"
Sending updates every 30 seconds with +/-50%, next due in -1173105357 seconds
Timeout after 180 seconds, garbage collect after 120 seconds
Outgoing update filter list for all interface is rip_lan
Incoming update filter list for all interface is rip_lan
Default redistribution metric is 1
[etc...]

Or, to quickly save the configuration (with one file per routing protocol – by default vtysh wants to save everything to a single combined file, Quagga.conf):

ZZZ:/etc# VTYSH_PAGER=/bin/cat /usr/bin/vtysh -c 'conf t' -c 'no service integrated-vtysh-config' -c 'end' -c 'write'
Building Configuration...
Configuration saved to /etc/quagga/zebra.conf
Configuration saved to /etc/quagga/ripd.conf
Configuration saved to /etc/quagga/ospfd.conf
[OK]
ZZZ:/etc#