JBoss Remoting 2.2.2里的一個Bug
在JBoss Remoting 2.2.2中存在這么一個bug,如果剛好客戶端的timeout比服務器端處理時間短的話,就會出現客戶端連接池中的連接被無故用掉一個的狀況,而且是沒法回收的,最終就會導致很快客戶端的連接池被占滿的現象,在分析JBoss Remoting 2.2.2的代碼后發現了問題的所在,同時查看了下JBoss Remoting 2.4的代碼,發現在2.4中此bug已被修復。
來看下JBoss Remoting 2.2.2中有問題的這段代碼的片斷:
在這里JBoss Remoting過于相信createClientSocket這行代碼了,jboss remoting認為這行代碼是不可能拋出異常的,但事實上其實這行是有可能會拋出異常的,可以想下,如果這行代碼執行拋出異常的話,會造成的現象就是之前說的,客戶端連接池中的連接被占用了一個,而且沒有回收的地方。
所以最簡單的方法自然是在pooled=createClientSocket(socket, address.timeout, metadata);這行代碼上增加捕捉try...catch,如果有異常拋出的話,則將usedPooled--;就像之前createSocket那個地方一樣。
在JBoss Remoting 2.4中,jboss不再采用usedPooled這個long型加上usedPoolLock這個對象鎖的方式來控制連接池,而是改為了采用更簡單好用的Semphore,不過用的還是EDG包的,而不是java 5的,來看看jboss remoting 2.4中的這段代碼改成什么樣了:
這樣自然是不會再出現2.2.2版本里的那個bug了。
:),由于2.4是重構為了采用semphore,不知道這個bug是剛好湊巧被這樣修復了呢,還是知道了這個bug進行fixed,呵呵,不管如何,總之bug是被修訂了。
這個bug對于使用jboss remoting的同學們而言還是要引起注意的,因為jboss remoting 2.2.2是jboss as 4.2.2中默認帶的版本。
從上面的代碼可以看到使用semphore這樣的方式來控制并發的需要限制大小的數據結構是非常好的,簡單易用,以前的那些long+Object Lock的方式實在是繁瑣,另外這個bug也給大家提了醒,在并發的這些資源的控制上千萬要注意鎖以及釋放的點,千萬不要主觀的認為某些代碼是絕對不會出問題的。
來看下JBoss Remoting 2.2.2中有問題的這段代碼的片斷:
synchronized(usedPoolLock)
{
if (pooled != null)
{
usedPooled++;
if (trace) log.trace(this + " got a socket, usedPooled: " + usedPooled);
break;
}
if (usedPooled < maxPoolSize)
{
// Try to get a socket.
if (trace) log.trace(this + " getting a socket, usedPooled: " + usedPooled);
usedPooled++;
}
else
{
retry = true;
if (trace) log.trace(this + " will try again to get a socket");
}
}
Socket socket = null;
long timestamp = System.currentTimeMillis();
try
{
if (trace) { log.trace(this + " creating socket " + (counter++) + ", attempt " + (i + 1)); }
socket = createSocket(address.address, address.port, timeRemaining);
if (trace) log.trace(this + " created socket: " + socket);
}
catch (Exception ex)
{
log.debug(this + " got Exception " + ex + ", creation attempt took " +
(System.currentTimeMillis() - timestamp) + " ms");
synchronized(usedPoolLock)
{
usedPooled--;
}
if (i + 1 < numberOfRetries)
{
Thread.sleep(1);
continue;
}
throw ex;
}
socket.setTcpNoDelay(address.enableTcpNoDelay);
Map metadata = getLocator().getParameters();
if (metadata == null)
{
metadata = new HashMap(2);
}
else
{
metadata = new HashMap(metadata);
}
metadata.put(SocketWrapper.MARSHALLER, marshaller);
metadata.put(SocketWrapper.UNMARSHALLER, unmarshaller);
if (timeAllowed > 0)
{
timeRemaining = (int) (timeAllowed - (System.currentTimeMillis() - start));
if (timeRemaining <= 0)
break;
metadata.put(SocketWrapper.TEMP_TIMEOUT, new Integer(timeRemaining));
}
pooled = createClientSocket(socket, address.timeout, metadata);
這段代碼的問題出在哪呢,就出在最后一行,或者說出在前面實現給usedPooled++上也可以。{
if (pooled != null)
{
usedPooled++;
if (trace) log.trace(this + " got a socket, usedPooled: " + usedPooled);
break;
}
if (usedPooled < maxPoolSize)
{
// Try to get a socket.
if (trace) log.trace(this + " getting a socket, usedPooled: " + usedPooled);
usedPooled++;
}
else
{
retry = true;
if (trace) log.trace(this + " will try again to get a socket");
}
}
Socket socket = null;
long timestamp = System.currentTimeMillis();
try
{
if (trace) { log.trace(this + " creating socket " + (counter++) + ", attempt " + (i + 1)); }
socket = createSocket(address.address, address.port, timeRemaining);
if (trace) log.trace(this + " created socket: " + socket);
}
catch (Exception ex)
{
log.debug(this + " got Exception " + ex + ", creation attempt took " +
(System.currentTimeMillis() - timestamp) + " ms");
synchronized(usedPoolLock)
{
usedPooled--;
}
if (i + 1 < numberOfRetries)
{
Thread.sleep(1);
continue;
}
throw ex;
}
socket.setTcpNoDelay(address.enableTcpNoDelay);
Map metadata = getLocator().getParameters();
if (metadata == null)
{
metadata = new HashMap(2);
}
else
{
metadata = new HashMap(metadata);
}
metadata.put(SocketWrapper.MARSHALLER, marshaller);
metadata.put(SocketWrapper.UNMARSHALLER, unmarshaller);
if (timeAllowed > 0)
{
timeRemaining = (int) (timeAllowed - (System.currentTimeMillis() - start));
if (timeRemaining <= 0)
break;
metadata.put(SocketWrapper.TEMP_TIMEOUT, new Integer(timeRemaining));
}
pooled = createClientSocket(socket, address.timeout, metadata);
在這里JBoss Remoting過于相信createClientSocket這行代碼了,jboss remoting認為這行代碼是不可能拋出異常的,但事實上其實這行是有可能會拋出異常的,可以想下,如果這行代碼執行拋出異常的話,會造成的現象就是之前說的,客戶端連接池中的連接被占用了一個,而且沒有回收的地方。
所以最簡單的方法自然是在pooled=createClientSocket(socket, address.timeout, metadata);這行代碼上增加捕捉try...catch,如果有異常拋出的話,則將usedPooled--;就像之前createSocket那個地方一樣。
在JBoss Remoting 2.4中,jboss不再采用usedPooled這個long型加上usedPoolLock這個對象鎖的方式來控制連接池,而是改為了采用更簡單好用的Semphore,不過用的還是EDG包的,而不是java 5的,來看看jboss remoting 2.4中的這段代碼改成什么樣了:
boolean timedout = !semaphore.attempt(timeToWait);
if (trace) log.trace(this + " obtained semaphore: " + semaphore.permits());
if (timedout)
{
throw new IllegalStateException("Timeout waiting for a free socket");
}
SocketWrapper pooled = null;
if (tryPool)
{
synchronized (pool)
{
// if connection within pool, use it
if (pool.size() > 0)
{
pooled = getPooledConnection();
if (trace) log.trace(this + " reusing pooled connection: " + pooled);
}
}
}
else
{
if (trace) log.trace(this + " avoiding connection pool, creating new socket");
}
if (pooled == null)
{
//Need to create a new one
Socket socket = null;
if (trace) { log.trace(this + " creating socket "); }
// timeAllowed < 0 indicates no per invocation timeout has been set.
int timeRemaining = -1;
if (0 <= timeAllowed)
{
timeRemaining = (int) (timeAllowed - (System.currentTimeMillis() - start));
}
socket = createSocket(address.address, address.port, timeRemaining);
if (trace) log.trace(this + " created socket: " + socket);
socket.setTcpNoDelay(address.enableTcpNoDelay);
Map metadata = getLocator().getParameters();
if (metadata == null)
{
metadata = new HashMap(2);
}
else
{
metadata = new HashMap(metadata);
}
metadata.put(SocketWrapper.MARSHALLER, marshaller);
metadata.put(SocketWrapper.UNMARSHALLER, unmarshaller);
if (timeAllowed > 0)
{
timeRemaining = (int) (timeAllowed - (System.currentTimeMillis() - start));
if (timeRemaining <= 0)
throw new IllegalStateException("Timeout creating a new socket");
metadata.put(SocketWrapper.TEMP_TIMEOUT, new Integer(timeRemaining));
}
pooled = createClientSocket(socket, address.timeout, metadata);
}
return pooled;
從以上代碼可以看到,JBoss首先是通過semphore.attempt的方式來獲取信號量鎖,然后就在下面的所有代碼中都不做異常的捕捉,jboss在這里改為了在外面統一捕捉這個方法的所有異常,并在有異常的情況下再調用semphore.release():if (trace) log.trace(this + " obtained semaphore: " + semaphore.permits());
if (timedout)
{
throw new IllegalStateException("Timeout waiting for a free socket");
}
SocketWrapper pooled = null;
if (tryPool)
{
synchronized (pool)
{
// if connection within pool, use it
if (pool.size() > 0)
{
pooled = getPooledConnection();
if (trace) log.trace(this + " reusing pooled connection: " + pooled);
}
}
}
else
{
if (trace) log.trace(this + " avoiding connection pool, creating new socket");
}
if (pooled == null)
{
//Need to create a new one
Socket socket = null;
if (trace) { log.trace(this + " creating socket "); }
// timeAllowed < 0 indicates no per invocation timeout has been set.
int timeRemaining = -1;
if (0 <= timeAllowed)
{
timeRemaining = (int) (timeAllowed - (System.currentTimeMillis() - start));
}
socket = createSocket(address.address, address.port, timeRemaining);
if (trace) log.trace(this + " created socket: " + socket);
socket.setTcpNoDelay(address.enableTcpNoDelay);
Map metadata = getLocator().getParameters();
if (metadata == null)
{
metadata = new HashMap(2);
}
else
{
metadata = new HashMap(metadata);
}
metadata.put(SocketWrapper.MARSHALLER, marshaller);
metadata.put(SocketWrapper.UNMARSHALLER, unmarshaller);
if (timeAllowed > 0)
{
timeRemaining = (int) (timeAllowed - (System.currentTimeMillis() - start));
if (timeRemaining <= 0)
throw new IllegalStateException("Timeout creating a new socket");
metadata.put(SocketWrapper.TEMP_TIMEOUT, new Integer(timeRemaining));
}
pooled = createClientSocket(socket, address.timeout, metadata);
}
return pooled;
try
{
boolean tryPool = retryCount < (numberOfCallRetries - 1)
|| maxPoolSize == 1
|| numberOfCallRetries == 1;
long l = System.currentTimeMillis();
socketWrapper = getConnection(marshaller, unmarshaller, tryPool, timeLeft);
long d = System.currentTimeMillis() - l;
if (trace) log.trace("took " + d + " ms to get socket " + socketWrapper);
}
catch (Exception e)
{
// if (bailOut)
// return null;
semaphore.release();
if (trace) log.trace(this + " released semaphore: " + semaphore.permits(), e);
sockEx = new CannotConnectException(
"Can not get connection to server. Problem establishing " +
"socket connection for " + locator, e);
continue;
}
{
boolean tryPool = retryCount < (numberOfCallRetries - 1)
|| maxPoolSize == 1
|| numberOfCallRetries == 1;
long l = System.currentTimeMillis();
socketWrapper = getConnection(marshaller, unmarshaller, tryPool, timeLeft);
long d = System.currentTimeMillis() - l;
if (trace) log.trace("took " + d + " ms to get socket " + socketWrapper);
}
catch (Exception e)
{
// if (bailOut)
// return null;
semaphore.release();
if (trace) log.trace(this + " released semaphore: " + semaphore.permits(), e);
sockEx = new CannotConnectException(
"Can not get connection to server. Problem establishing " +
"socket connection for " + locator, e);
continue;
}
這樣自然是不會再出現2.2.2版本里的那個bug了。
:),由于2.4是重構為了采用semphore,不知道這個bug是剛好湊巧被這樣修復了呢,還是知道了這個bug進行fixed,呵呵,不管如何,總之bug是被修訂了。
這個bug對于使用jboss remoting的同學們而言還是要引起注意的,因為jboss remoting 2.2.2是jboss as 4.2.2中默認帶的版本。
從上面的代碼可以看到使用semphore這樣的方式來控制并發的需要限制大小的數據結構是非常好的,簡單易用,以前的那些long+Object Lock的方式實在是繁瑣,另外這個bug也給大家提了醒,在并發的這些資源的控制上千萬要注意鎖以及釋放的點,千萬不要主觀的認為某些代碼是絕對不會出問題的。
posted on 2008-06-30 18:46 BlueDavy 閱讀(4801) 評論(0) 編輯 收藏 所屬分類: Java