Bug#648218: firebird2.5-classic: fb_inet_server segfaults several time a day
Robert Vojta
2011-11-09 17:20:02 UTC
Package: firebird2.5-classic
Version: 2.5.0.26054~ReleaseCandidate3.ds2-1+b1
Severity: important
Tags: squeeze

Firebird, namely fb_inet_server, segfaults several times per day. I tried
-super, -superclassic too, but the classic one is crashing less frequently
than others.

I'm running it on a fully updated Squeeze on a Dell AMD64 server. The Squeeze
installation is 32-bit. Everything except Firebird works like a charm.

More connections lead to more crashes. In other words, when I use just
one client, it crashes several times per day. But when I use two or more
clients, it's dead within 30 minutes and I can no longer use Firebird: I can't
connect.

I have these errors in /var/log/firebird2.5.log

- INET/inet_error: read errno = 104

Sometimes when I kill the fb_inet_server processes, I get these messages
in the Firebird log:

- Firebird shutdown is still in progress after the specified timeout
- Operating system call pthread_mutex_destroy failed. Error code 16
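
A side note for readers decoding the numbers in these log messages: on Linux, errno 104 is ECONNRESET (the peer reset the connection) and error code 16 is EBUSY, which pthread_mutex_destroy may report when the mutex is still locked or referenced. A minimal lookup sketch follows; the function name `errno_name` is made up for illustration, and the table deliberately covers only the two values seen here.

```shell
#!/bin/sh
# Map the two Linux errno values seen in the Firebird log to symbolic names.
# Only these two values are covered; anything else is reported as unknown.
errno_name() {
    case "$1" in
        16)  echo "EBUSY: Device or resource busy" ;;
        104) echo "ECONNRESET: Connection reset by peer" ;;
        *)   echo "unknown: value $1 is not in this table" ;;
    esac
}

errno_name 104
errno_name 16
```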

And the system log contains these messages:

[10062.432824] fb_inet_server[2246]: segfault at b1d4ca88 ip b1d4ca88 sp b606734c error 4
[20649.227405] fb_inet_server[2537]: segfault at b1cffa88 ip b1cffa88 sp b601a34c error 4 in gconv-modules.cache[b1dbd000+7000]
[21339.007780] fb_inet_server[2554]: segfault at b1db6a88 ip b1db6a88 sp b60d134c error 4
[22253.359491] fb_inet_server[2574]: segfault at b1dd9a88 ip b1dd9a88 sp b60d334c error 4 in gconv-modules.cache[b1e76000+7000]
[25012.220851] fb_inet_server[2606]: segfault at b1e63a88 ip b1e63a88 sp b615d34c error 4 in gconv-modules.cache[b1f00000+7000]
[26228.282421] fb_inet_server[2637]: segfault at b1ecea88 ip b1ecea88 sp b61c834c error 4 in gconv-modules.cache[b1f6b000+7000]
[31144.249003] fb_inet_server[2733]: segfault at b1ce9a88 ip b1ce9a88 sp b5fd334c error 4 in gconv-modules.cache[b1d76000+7000]
[31341.054761] fb_inet_server[2755]: segfault at b1da6a88 ip b1da6a88 sp b609034c error 4 in gconv-modules.cache[b1e33000+7000]
[31359.635212] fb_inet_server[2676]: segfault at b1cf6a88 ip b1cf6a88 sp b5ff034c error 4 in gconv-modules.cache[b1d93000+7000]
[31380.681084] fb_inet_server[2769]: segfault at b1ed1a88 ip b1ed1a88 sp b61cb34c error 4
[34686.255276] fb_inet_server[2825]: segfault at b1e40a88 ip b1e40a88 sp b612a34c error 4 in gconv-modules.cache[b1ecd000+7000]
[36906.366343] fb_inet_server[2781]: segfault at b1eb4a88 ip b1eb4a88 sp b61ae34c error 4 in gconv-modules.cache[b1f51000+7000]
[39485.786601] fb_inet_server[2911]: segfault at b1d67a88 ip b1d67a88 sp b606134c error 4 in gconv-modules.cache[b1e04000+7000]
[39636.909913] fb_inet_server[2921]: segfault at b1e9ca88 ip b1e9ca88 sp b619634c error 4 in gconv-modules.cache[b1f39000+7000]
[39666.526356] fb_inet_server[2928]: segfault at b1cfca88 ip b1cfca88 sp b601734c error 4
[47592.753573] fb_inet_server[3070]: segfault at b1ceca88 ip b1ceca88 sp b600734c error 4
[48050.436549] fb_inet_server[2972]: segfault at b1d75a88 ip b1d75a88 sp b606f34c error 4 in gconv-modules.cache[b1e12000+7000]
[56475.898936] fb_inet_server[3324]: segfault at b1da3a88 ip b1da3a88 sp b60be34c error 4 in gconv-modules.cache[b1e61000+7000]
[59000.029004] fb_inet_server[3436]: segfault at b1d22a88 ip b1d22a88 sp b454734c error 4
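
A note on reading these kernel lines: the number after `error` is the x86 page-fault error code, a bit field in which bit 0 distinguishes a missing page from a protection violation, bit 1 a read from a write, and bit 2 kernel mode from user mode. `error 4` is therefore a user-mode read of an unmapped page, and in every line above the faulting address equals `ip`, which suggests (but does not prove) that the process jumped to an address that was no longer mapped. The sketch below decodes one such line; the function names are hypothetical and only the three low bits are handled.

```shell
#!/bin/sh
# Decode the three low bits of an x86 page-fault error code.
decode_pf_error() {
    code=$1
    if [ $((code & 4)) -ne 0 ]; then mode="user-mode"; else mode="kernel-mode"; fi
    if [ $((code & 2)) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $((code & 1)) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    echo "$mode $access, $cause"
}

# Summarise one "segfault at ADDR ip IP sp SP error N" line from the kernel log.
summarise_segfault() {
    line=$1
    addr=$(printf '%s\n' "$line" | sed -n 's/.*segfault at \([0-9a-f]*\) ip .*/\1/p')
    ip=$(printf '%s\n' "$line" | sed -n 's/.* ip \([0-9a-f]*\) sp .*/\1/p')
    err=$(printf '%s\n' "$line" | sed -n 's/.* error \([0-9]*\).*/\1/p')
    if [ "$addr" = "$ip" ]; then
        where="fault address equals ip"
    else
        where="fault address differs from ip"
    fi
    echo "$(decode_pf_error "$err"); $where"
}

summarise_segfault '[10062.432824] fb_inet_server[2246]: segfault at b1d4ca88 ip b1d4ca88 sp b606734c error 4'
```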

I have no idea what could be wrong, but I'm willing to test anything. This is a
production server and we have live data in this database.

-- System Information:
Debian Release: 6.0.3
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.32-5-686 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages firebird2.5-classic depends on:
ii d 1.5.36.1 Debian configuration management sy
ii f 2.5.0.26054~ReleaseCandidate3.ds2-1+b1 common files for firebird 2.5 "cla
ii f 2.5.0.26054~ReleaseCandidate3.ds2-1+b1 common files for firebird 2.5 serv
ii f 2.5.0.26054~ReleaseCandidate3.ds2-1 copyright, licnesing and changelog
ii f 2.5.0.26054~ReleaseCandidate3.ds2-1+b1 common files for firebird 2.5 serv
ii l 2.11.2-10 Embedded GNU C Library: Shared lib
ii l 2.5.0.26054~ReleaseCandidate3.ds2-1+b1 Firebird embedded client/server li
ii l 1:4.4.5-8 GCC support library
ii l 4.4.5-8 The GNU Standard C++ Library v3
ii n 4.45 Basic TCP/IP networking system
ii o 0.20080125-6 The OpenBSD Internet Superserver

Versions of packages firebird2.5-classic recommends:
ii l 2.5.0.26054~ReleaseCandidate3.ds2-1+b1 Firebird UDF support library

Versions of packages firebird2.5-classic suggests:
pn firebird2.5-doc <none> (no description available)

-- debconf information:
shared/firebird/sysdba_password/new_password: (password omitted)
shared/firebird/sysdba_password/upgrade_reconfigure: (password omitted)
* shared/firebird/sysdba_password/first_install: (password omitted)
shared/firebird/server_in_use:
shared/firebird/purge_databases: false
shared/firebird/title:
* shared/firebird/enabled: true
shared/firebird/purge_security: false
--
To UNSUBSCRIBE, email to debian-bugs-dist-***@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact ***@lists.debian.org
Damyan Ivanov
2011-11-09 20:00:02 UTC
-=| Robert Vojta, 09.11.2011 18:02:08 +0100 |=-
Post by Robert Vojta
Package: firebird2.5-classic
Version: 2.5.0.26054~ReleaseCandidate3.ds2-1+b1
Severity: important
Tags: squeeze
Firebird, namely fb_inet_server, segfaults several times per day. I tried
-super, -superclassic too, but the classic one is crashing less frequently
than others.
More connections lead to more crashes. In other words, when I
use just one client, it crashes several times per day. But when I
use two or more clients, it's dead within 30 minutes and I can no
longer use Firebird: I can't connect.
Even to the classic server? That's strange since fb_inet_server is run
from inetd/xinetd and if you can't connect there is something terribly
wrong somewhere.
Post by Robert Vojta
I have these errors in /var/log/firebird2.5.log
- INET/inet_error: read errno = 104
Sometimes when I kill the fb_inet_server processes, I get these messages
in the Firebird log:
I wonder why killing is necessary.
Post by Robert Vojta
- Firebird shutdown is still in progress after the specified
timeout
- Operating system call pthread_mutex_destroy failed. Error code 16
And the system log contains these messages:
[10062.432824] fb_inet_server[2246]: segfault at b1d4ca88 ip b1d4ca88 sp b606734c error 4
[20649.227405] fb_inet_server[2537]: segfault at b1cffa88 ip b1cffa88 sp b601a34c error 4 in gconv-modules.cache[b1dbd000+7000]
I have similar segfault messages with 2.1.3.18185-0.ds1-11+b1 (64-bit
squeeze), but these are emitted when clients disconnect from the server
and are harmless.

Is this the same in your case or does the process crash during its
work?
Chiefly Izzy
2011-11-09 20:20:02 UTC
Post by Damyan Ivanov
Even to the classic server? That's strange since fb_inet_server is run
from inetd/xinetd and if you can't connect there is something terribly
wrong somewhere.
Sorry, I used the wrong words here. I can telnet to port 3050, so (x)inetd works and it starts another fb_inet_server. But that's all; I can't use this connection. The problem is that I have a closed-source client on Windows and it simply freezes and no longer works. When I kill this client and start it again, it still doesn't work ...
Post by Damyan Ivanov
I wonder why killing is necessary.
... and it only starts working again when I do this:

/etc/init.d/openbsd-inetd stop
killall -9 fb_inet_server
/etc/init.d/openbsd-inetd start

... and after these steps, the client app works normally again. According to ps, when this happens I have roughly 10 instances of fb_inet_server running. A new instance of fb_inet_server is started after each segfault message.

When I tried -superclassic and -super, I also had to kill them to get things working again.
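
The recovery steps above could be wrapped in a small helper so that they are always applied in the same order. This is only a sketch: the function names and the `DRY_RUN` variable are assumptions for illustration, the commands themselves are the three quoted above, and it defaults to printing rather than executing because `killall -9` drops every live connection.

```shell
#!/bin/sh
# Print the command when DRY_RUN is 1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop inetd so no new fb_inet_server is spawned, kill the stuck
# processes, then start inetd again.
recover_firebird() {
    run /etc/init.d/openbsd-inetd stop
    run killall -9 fb_inet_server
    run /etc/init.d/openbsd-inetd start
}

recover_firebird
```

Running it with `DRY_RUN=0` as root would actually perform the restart.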
Post by Damyan Ivanov
I have similar segfault messages with 2.1.3.18185-0.ds1-11+b1 (64-bit
squeeze), but these are emitted when clients disconnect from the server
and are harmless.
According to our client application vendor, there is a single connection and it is kept alive.

When I installed Windows Server 2008 and made the same setup, it worked flawlessly without any issues. Same version of Firebird.
Post by Damyan Ivanov
Is this the same in your case or does the process crash during its
work?
Crash during work.

Connection drops and the like shouldn't be the problem, because both machines are on the same local network.

If you want me to test anything when this happens, let me know; I can do whatever is needed to get more information about what's happening.
marius adrian popa
2011-11-09 20:20:03 UTC
Post by Damyan Ivanov
-=| Robert Vojta, 09.11.2011 18:02:08 +0100 |=-
Post by Robert Vojta
Package: firebird2.5-classic
Version: 2.5.0.26054~ReleaseCandidate3.ds2-1+b1
Severity: important
Tags: squeeze
I also see that you are using ReleaseCandidate3.
Maybe we should prepare a backport.
Damyan Ivanov
2011-11-10 20:10:01 UTC
-=| marius adrian popa, 09.11.2011 22:10:21 +0200 |=-
Post by marius adrian popa
I also see that you are using ReleaseCandidate3.
Maybe we should prepare a backport.
A backport won't fix the problem for other users of stable.

I'll look at the upstream changes in the 2.5 branch after the third
release candidate. Maybe there is a commit we can cherry-pick.
Damyan Ivanov
2011-11-10 21:40:03 UTC
-=| Damyan Ivanov, 10.11.2011 22:01:55 +0200 |=-
Post by Damyan Ivanov
-=| marius adrian popa, 09.11.2011 22:10:21 +0200 |=-
Post by marius adrian popa
I also see that you are using ReleaseCandidate3.
Maybe we should prepare a backport.
A backport won't fix the problem for other users of stable.
I'll look at the upstream changes in the 2.5 branch after the third
release candidate. Maybe there is a commit we can cherry-pick.
Hmm. There are about 350 potentially useful commits between 2.5.0 RC3
and 2.5.0 final. Lots of them claim to be fixing server crashes :(

Robert, is there an easy way to make the server crash? Having such
a procedure would help testing any potential fixes.
Chiefly Izzy
2011-11-11 12:10:02 UTC
Post by Damyan Ivanov
Hmm. There are about 350 potentially useful commits between 2.5.0 RC3
and 2.5.0 final. Lots of them claim to be fixing server crashes :(
Robert, is there an easy way to make the server crash? Having such
a procedure would help testing any potential fixes.
Damyan, I can make the server crash easily: whenever I start more than one client application, it always crashes in under 30 minutes. But only I can test this, since the client is closed source and Windows only, and I can't share it or the database data. Yes, I have a reproducible case, but I have no idea what's going on under the hood. That's probably not very helpful ... But I can test whatever you want, anytime.
Damyan Ivanov
2011-11-13 12:00:02 UTC
-=| Chiefly Izzy, 11.11.2011 12:57:12 +0100 |=-
Post by Chiefly Izzy
Post by Damyan Ivanov
Hmm. There are about 350 potentially useful commits between 2.5.0 RC3
and 2.5.0 final. Lots of them claim to be fixing server crashes :(
Robert, is there an easy way to make the server crash? Having such
a procedure would help testing any potential fixes.
Damyan, I can make the server crash easily: whenever I start more
than one client application, it always crashes in under 30 minutes.
But only I can test this, since the client is closed source and
Windows only, and I can't share it or the database data. Yes, I have
a reproducible case, but I have no idea what's going on under the
hood. That's probably not very helpful ... But I can test whatever
you want, anytime.
Okay, this is going to be hard :)

I have prepared packages from the current version in wheezy
(2.5.1.26351), built for squeeze.

ftp://ftp.modsoftsys.net/public/firebird2.5-backports/

If these improve the situation, then we at least know the bug is fixed
in that version and can start looking for the exact fix, which would
take about 10 iterations (and probably several weeks).
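
The estimate of about 10 iterations is in line with what a binary search over the commit range would need: each build-and-test round halves the candidate range, so roughly log2(N) rounds are required. A quick sketch of that arithmetic, using the 350 commits mentioned earlier in the thread as input (the number of commits up to 2.5.1.26351 is not stated here and would be larger; the function name is illustrative):

```shell
#!/bin/sh
# Number of halving rounds needed to narrow N candidate commits down to one.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))     # worst case: the larger half remains
        steps=$((steps + 1))
    done
    echo "$steps"
}

bisect_steps 350    # prints 9
```

With each round meaning a full rebuild plus a day or more of testing under load, nine or ten rounds would plausibly add up to the several weeks mentioned.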
Chiefly Izzy
2011-11-14 19:20:01 UTC
Post by Damyan Ivanov
I have prepared packages from the current version in wheezy
(2.5.1.26351), built for squeeze.
ftp://ftp.modsoftsys.net/public/firebird2.5-backports/
If these improve the situation, then we at least know the bug is fixed
in that version and can start looking for the exact fix, which would
take about 10 iterations (and probably several weeks).
Damyan, thanks. Do you have the source package too, so I can rebuild it on my machine? I have Squeeze running on AMD64 hardware, but installed as i686, because I have a lot of i386 binaries to run on this box.
Damyan Ivanov
2011-11-14 19:50:01 UTC
-=| Chiefly Izzy, 14.11.2011 20:16:02 +0100 |=-
Post by Chiefly Izzy
Post by Damyan Ivanov
I have prepared packages from the current version in wheezy
(2.5.1.26351), built for squeeze.
ftp://ftp.modsoftsys.net/public/firebird2.5-backports/
If these improve the situation, then we at least know the bug is fixed
in that version and can start looking for the exact fix, which would
take about 10 iterations (and probably several weeks).
Damyan, thanks. Do you have the source package too, so I can rebuild it
on my machine? I have Squeeze running on AMD64 hardware, but installed
as i686, because I have a lot of i386 binaries to run on this box.
Grab the .dsc and .debian.tar.gz files from the above location. The
upstream sources are the same as in wheezy, i.e.
ftp://ftp.debian.org/debian/pool/main/f/firebird2.5/firebird2.5_2.5.1.26351.ds4.orig.tar.gz

Having all three in the current directory, run
dpkg-source -x firebird2.5_2.5.1.26351.ds4-2~bpo60+1.dsc
and you'll get an unpacked source (with all the patches applied). Run
dpkg-buildpackage in the unpacked directory and wait (-j works for
using multiple processors).
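
Put together, the steps described above might look like the sketch below. The helper names are hypothetical. The `-us -uc` flags (skip signing) and the use of `getconf _NPROCESSORS_ONLN` to choose a `-j` value are additions not mentioned in the message, and the unpacked directory name relies on the usual dpkg-source convention of `<source>-<upstream version>`.

```shell
#!/bin/sh
# Derive the directory that "dpkg-source -x" creates from a .dsc file name:
#   <source>_<upstream>-<revision>.dsc  ->  <source>-<upstream>
srcdir_from_dsc() {
    base=${1##*/}            # drop any leading directory
    base=${base%.dsc}        # drop the .dsc extension
    pkg=${base%%_*}          # source package name
    ver=${base#*_}           # upstream version plus Debian revision
    echo "$pkg-${ver%-*}"    # drop the Debian revision
}

# Unpack and build. Expects the .dsc, .debian.tar.gz and .orig.tar.gz
# files to sit next to each other in the current directory.
build_from_dsc() {
    dsc=$1
    jobs=$(getconf _NPROCESSORS_ONLN)   # assumption: one job per online CPU
    dpkg-source -x "$dsc" &&
        ( cd "$(srcdir_from_dsc "$dsc")" && dpkg-buildpackage -us -uc -j"$jobs" )
}

# Usage (not executed here):
#   build_from_dsc firebird2.5_2.5.1.26351.ds4-2~bpo60+1.dsc
```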
Chiefly Izzy
2011-11-14 21:10:02 UTC
On 14. 11. 2011, at 20:44, Damyan Ivanov wrote:

Thanks to both of you for the steps. Rebuilt, installed, testing ... I'll let you know around Wednesday, when I'll connect another batch of computers (= clients) to our network and the load will be heavier than it is now. I'll test tomorrow with two clients and we'll see whether it crashes.
Chiefly Izzy
2011-11-16 12:50:01 UTC
Hi all,

So far it looks pretty stable today. Since 7 AM (it's 2 PM now) we have been
running additional clients (4 now instead of 1 before), with no crashes, and
everything works as expected.

There are no frozen fb_inet_server processes, though there are still some
segfault messages in the logs:

***@server:~# tail -f /var/log/messages
Nov 16 08:15:08 server kernel: [630501.034377] fb_inet_server[15875]: segfault at b1dfda88 ip b1dfda88 sp b60db34c error 4 in gconv-modules.cache[b1e7a000+7000]
Nov 16 08:18:03 server kernel: [630675.616160] fb_inet_server[15919]: segfault at b1d06a88 ip b1d06a88 sp b5fd434c error 4 in gconv-modules.cache[b1d73000+7000]
Nov 16 08:43:51 server kernel: [632221.925028] fb_inet_server[15978]: segfault at b1e69a88 ip b1e69a88 sp b616834c error 4 in gconv-modules.cache[b1f06000+7000]
Nov 16 08:44:42 server kernel: [632273.510958] fb_inet_server[15987]: segfault at b1d21a88 ip b1d21a88 sp b602034c error 4 in gconv-modules.cache[b1dbe000+7000]
Nov 16 08:48:44 server kernel: [632515.336123] fb_inet_server[16004]: segfault at b1e8ca88 ip b1e8ca88 sp b618b34c error 4
Nov 16 09:23:55 server kernel: [634624.602696] fb_inet_server[16054]: segfault at b1e5fa88 ip b1e5fa88 sp b614e34c error 4
Nov 16 10:16:56 server kernel: [637802.845802] fb_inet_server[16103]: segfault at b1e89a88 ip b1e89a88 sp b618834c error 4 in gconv-modules.cache[b1f26000+7000]
Nov 16 10:46:36 server kernel: [639580.911182] fb_inet_server[16171]: segfault at b1d17a88 ip b1d17a88 sp b601534c error 4 in gconv-modules.cache[b1db4000+7000]
Nov 16 10:47:13 server kernel: [639617.967977] fb_inet_server[16186]: segfault at b1d18a88 ip b1d18a88 sp b600734c error 4
Nov 16 10:48:14 server kernel: [639678.277943] fb_inet_server[16188]: segfault at b1e37a88 ip b1e37a88 sp b613634c error 4

... but as you wrote before, they're harmless and appear only when a client
closes its connection. There are no other crashes.
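
To check whether the remaining segfaults really line up with client disconnects, it may help to tally them per hour and compare the result against the times clients were closed. A rough sketch, assuming the traditional syslog timestamp format seen in the excerpt above (month, day, HH:MM:SS) and one entry per line in the actual log file; the function name is made up:

```shell
#!/bin/sh
# Count fb_inet_server segfault entries per hour.
# Reads the named log files, or standard input when none are given.
segfaults_per_hour() {
    cat "$@" |
        grep 'fb_inet_server\[[0-9]*]: segfault' |
        awk '{ split($3, t, ":"); n[$1 " " $2 " " t[1] ":00"]++ }
             END { for (k in n) print k, n[k] }' |
        sort
}

# Usage (not executed here):
#   segfaults_per_hour /var/log/messages
```

Each output line is an hour bucket followed by the number of segfault entries in it, for example `Nov 16 08:00 5`.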

So it's classic for today. I'll leave it on classic until midnight, and if
it keeps working without problems, I'll replace classic with -superclassic
for one day, and then with -super for another day.

I'll let you know. For now, it works perfectly.

Feel free to send me anything to test. I'll spend tomorrow and the day after
testing -superclassic and -super.

marius adrian popa
2011-11-14 19:50:02 UTC
Post by Chiefly Izzy
Post by Damyan Ivanov
I have prepared packages from the current version in wheezy
(2.5.1.26351), built for squeeze.
  ftp://ftp.modsoftsys.net/public/firebird2.5-backports/
If these improve the situation, then we at least know the bug is fixed
in that version and can start looking for the exact fix, which would
take about 10 iterations (and probably several weeks).
Damyan, thanks. Do you have the source package too, so I can rebuild it on my machine? I have Squeeze running on AMD64 hardware, but installed as i686, because I have a lot of i386 binaries to run on this box.
Here are the instructions for rebuilding it on your machine, if you are
running 32-bit:

http://jimicompot.blogspot.com/2011/11/rebuilding-firebird-251-from-stable-to.html