char *s = "hello"和char s1[]="world"的區別

水木上的帖子

s 是全局數據區，
s1 是函數的棧空間區域，函數執行完成，這個空間就沒了
------------------------------------------------------------------------------------------
7.2 可是我聽說 char a[ ] 和 char *a 是一樣的。
并非如此。(你所聽說的應該跟函數的形式參數有關；參見問題 6.4) 數組不是指針。數組定義 char a[6] 請求預留 6 個字符的位置, 并用名稱 ``a" 表示。也就是說, 有一個稱為 ``a" 的位置, 可以放入 6 個字符。而指針申明 char *p, 請求一個位置放置一個指針, 用名稱 ``p" 表示。這個指針幾乎可以指向任何位置: 任何字符和任何連續的字符, 或者哪里也不指(參見問題 5.1 和 1.10)。

一個圖形勝過千言萬語。聲明

   char a[] = "hello";
   char *p = "world";

將會初始化下圖所示的數據結果:
       +---+---+---+---+---+---+
   a: | h | e | l | l | o |\0 |
   +---+---+---+---+---+---+
   +-----+ +---+---+---+---+---+---+
   p: | *======> | w | o | r | l | d |\0 |
   +-----+ +---+---+---+---+---+---+

根據 x 是數組還是指針, 類似 x[3] 這樣的引用會生成不同的代碼。認識到這一點大有裨益。以上面的聲明為例, 當編譯器看到表達式 a[3] 的時候, 它生成代碼從 a 的位置開始跳過 3 個, 然后取出那個字符. 如果它看到 p[3], 它生成代碼找到 ``p" 的位置, 取出其中的指針值, 在指針上加 3 然后取出指向的字符。換言之, a[3] 是名為 a 的對象 (的起始位置) 之后 3 個位置的值, 而 p[3] 是 p 指向的對象的 3 個位置之后的值. 在上例中, a[3] 和 p[3] 碰巧都是 'l' , 但是編譯器到達那里的途徑不盡相同。本質的區別在于類似 a 的數組和類似 p 的指針一旦在表達式中出現就會按照不同的方法計算, 不論它們是否有下標。下一問題繼續深入解釋。參見問題 1.13。

參考資料： [K&R2, Sec. 5.5 p. 104]; [CT&P, Sec. 4.5 pp. 64-5]。

7.3 那么, 在 C 語言中 ``指針和數組等價" 到底是什么意思？
在 C 語言中對數組和指針的困惑多數都來自這句話。說數組和指針 ``等價" 不表示它們相同, 甚至也不能互換。它的意思是說數組和指針的算法定義可以用指針方便的訪問數組或者模擬數組。
特別地, 等價的基礎來自這個關鍵定義:

一個 T 的數組類型的左值如果出現在表達式中會蛻變為一個指向數組第一個成員的指針(除了三種例外情況); 結果指針的類型是 T 的指針。
這就是說, 一旦數組出現在表達式中, 編譯器會隱式地生成一個指向數組第一個成員地指針, 就像程序員寫出了 &a[0] 一樣。例外的情況是, 數組為 sizeof 或 & 操作符的操作數, 或者為字符數組的字符串初始值。

作為這個這個定義的后果, 編譯器并那么不嚴格區分數組下標操作符和指針。在形如 a[i] 的表達式中, 根據上邊的規則, 數組蛻化為指針然后按照指針變量的方式如 p[i] 那樣尋址, 如問題 6.2 所述, 盡管最終的內存訪問并不一樣。如果你把數組地址賦給指針:

   p = a;

那么 p[3] 和 a[3] 將會訪問同樣的成員。
參見問題 6.6 和 6.11。
------------------------------------------------------------------------------------------
char *s1 = "hello";
char s2[6] = "hello";

類型指針與類型數組名在很多場合中可等價使用。容易給人造成的印象是兩者是等價。
這話不盡然。首先我們要明白這是兩個不同的東西。

s1的類型char *，而s2的類型是array of char。
s1初始化為一個指針值，指向一個內存區域，該處有6個字符的數據，
即'h', 'e', 'l', 'l', 'o', '\0'。在運行過程中，s1的值可改變，指向其他任何允許的地址。
但上面的數據("hello")不會在程序退出之前銷毀[注：這是另外一個比較迷惑人的細節]，
即使s1變量生命周期結束。

s2初始化為6個字符的數組，也是'h', 'e', 'l', 'l', 'o', '\0'。在運行過程中，s2的內容可改變，
也就是存儲在s2中的hello也就"消失"了。

但為什么容易給人造成類型指針與類型數組名可等價的疑惑呢？雖然類型不同，但C規定（為了
追求簡潔與靈活性，C假設使用者知道自己代碼會有什么結果。）在很多場合下，認為數組名
與類型指針類型兼容。記憶中只有2中情況下，數組名不可等同視為數組指針，&與sizeof操作符。

   void foo1(const char *str) {...};
   void foo2(int size) {return size};
   ...
   char *s1 = "hello";
   char s2[6] = "hello";

   foo1(s1); // ok
   foo1(s2); // ok
   foo1(&s2); // incompatible
   foo2(&s2[0]); // ok

   s1[0] = 0; // error
   s2[0] = 0; // ok

   s1 = s2; // ok
   s2 = s1; // error

   // 下面假設在ia32平臺上運行
   foo2(sizeof(s1)); // return 4, pointer size
   foo2(sizeof(s2)); // return 6, array size

只記得上面的這些內容，不知道對錯，與大家共同提高。

posted @ 2007-10-01 01:40 ZelluX 閱讀(2229) | 評論 (0) | 編輯收藏

C++ 學習筆記 (6)

C++ 學習筆記(6)
1. 類聲明中定義的函數都被自動處理為inline函數。

2. Triangular t5(); 這句話似乎聲明了一個類的實例，但事實上，C++為了保持與C語法的一致，該語句會被解釋成一個返回Triangular對象的函數；正確的聲明應該去掉()。

3. 考慮下面一段代碼
class Matrix {
public:
    Matrix( int row, int col ) // ...
    ~Matrix() { delete [] _pmat; }
private:
    int _row, _col;
    double *_pmat;
};

{
    Matrix mat(4, 4);
    {
        Matrix mat2 = mat;
    }
}
    把mat復制給mat2后，mat2中的_pmat與mat的_pmat指向同一個數組，在mat2所在的域結束后，mat2被釋放，同時刪除了_pmat指針指向的內容。錯誤發生。
    解決辦法是在Matrix::Matrix( const Matrix &rhs )中詳細指出深層拷貝的方法，同時實現賦值操作符的重載。

4. mutable 和 const
const方法無法修改類的成員，mutable成員除外。

posted @ 2007-10-01 01:08 ZelluX 閱讀(391) | 評論 (0) | 編輯收藏

C++ 入門筆記 (5)

1. 查看某個類型的最大最小值：
#include <limits>
int max_int = numeric_limits<int>::max();
double min_dbl = numeric_limits<double>::max();

2. A struct is simply a class whose members are public by default.

3. 類的似有成員的保護依賴于對成員名使用的限制，也就是說，通過地址操作、強制類型轉換等技術可以被訪問，C++可以防止意外的發生，但不能阻止這種故意的欺騙。只有硬件層面才有可能防止這種惡意訪問，但通常在現實中也很難做到。

4. 使用類的初始化列表代替代碼體中的賦值語句可以節省很多時間。而且，如果一個數據成員是const的，那么它只能在初始化列表中進行初始化；如果一個數據成員是不具有零參數的構造函數的類類型，那么它也必須在初始化列表中進行初始化。

posted @ 2007-09-24 20:45 ZelluX 閱讀(364) | 評論 (0) | 編輯收藏

LL(1) 語法解析

還是做一點筆記，記得牢一些

有了follow和first集合后，就可以構造一張預測解析表(predictive parsing table)了。
具體方法是：
對于任一產生式X -> ƒ，找到first(ƒ)中的每一個元素T，把X -> ƒ填充到X行T列中去；
如果ƒ nullable，還要把X -> ƒ填充到X行follow(ƒ)列中去

預測解析表構造完成后，如果某格中不止一個產生式，則說明該語法不適用于預測解析表；
如果每格至多一個產生式，則該語法被稱為LL(1) Left-to-right parse, Leftmost-derivation, 1-symbol lookahead

posted @ 2007-09-24 19:34 ZelluX 閱讀(611) | 評論 (0) | 編輯收藏

zz 整理的內核學習計劃

開始看lkd，不過最新可以抽出來的時間不多，下星期Java課還有project，還要寫一份robocode的項目文檔，然后編譯課也得跟上，英語高口課還得準備資料。。。先轉一篇學習計劃過來

發信人: ccedcced (turing), 信區: KernelTech
標題: 整理的內核學習計劃
發信站: 水木社區 (Fri Jan 6 18:15:35 2006), 轉信

內核學習現在開始
方法就是：
讀書（lkd）+源代碼閱讀-->項目

先從基礎做起！
希望大家都能積極地參與討論，寫讀書筆記和源碼閱讀筆記！
------------------------------
內核學習第一周計劃--Are you ready?

時間:11月2號--11月5日
閱讀內容：前言和第一章
導讀：這是比較容易閱讀的開頭部分，內容比較少而且常識性的內容比較多。請在閱讀的過程中考慮如下的問題：

1.linux和unix相比有哪些特點？
2.內核編程和用戶空間編程相比有哪些不同之處？
3.自己編譯一下內核，你編譯成功了么？如果不成功，有什么問題？使用你新編譯的內核，
  能順利啟動么？有什么問題？
4.linux內核源代碼樹中你能找到sg設備驅動是在那個文件中實現的么？sg是什么含意?
  清楚地了解一下內核中源代碼樹的結構。
Are you ready?
------------------------------
內核學習第二周計劃
時間:11.7-11.13
內容:
主要是lkd中文版第一版第二章(英文第二版版第三章)的內容，比較重要。

1.和進程管理相關的內核文件都有哪些?找出來大致瀏覽一下.
2.什么是進程和線程?在Linux中有什么獨特的地方？
3.什么是進程描述符？怎樣得到當前進程的進程描述符？進程的內核棧有多大？
4.進程的狀態都有哪些？在什么情況下發生轉化？
5.Linux中所有進程之間的關系是怎么樣的？
6.用戶線程和內核線程的區別和聯系？
7.Linux是怎樣創建進程和線程的？
8.Linux怎樣終結進程？
9.對照相應的內核源代碼文件，分析一下問題3、5、6、7。
------------------------------
內核學習第三、四周計劃
內核學習第三、四周計劃：

時間：11.14-11.27

調度是操作系統中非常重要的部分，也是最核心的東西。通過學習這一章，希望大家都能大致掌握Linux的調度策略，以及為什么要這么做，Linux相比其它的系統有什么優點。重點多分析源代碼中的算法實現！！我希望我能有時間和大家一起分析代碼，也希望大家積極主動地分析一些代碼片斷。

學習內容：
1.進程調度最基本的原理是什么？
2.列舉出幾個I/O消耗性和處理器消耗型的進程。
3.Linux都采用了哪些調度的算法？詳細解釋一下這些算法。
4.進程什么時間進入運行態？什么時間進入休眠（阻塞）狀態？
5.了解進程搶占的算法；
6.討論一下Linux進程調度的實時性怎么樣，還有哪些需要提高的地方？
7.自己查找進程調度的相關文件，分析為題3-6。
------------------------------
內核學習第五周計劃
時間：11.28--12.5

1、什么是系統調用？

2、為什么需要系統調用？

3、實現系統調用相關的代碼有哪些，找出來瀏覽一下

4、詳細閱讀getuid()這一下系統調用的實現代碼

5、如何導出sys_call_table，有幾種方法，注意不同內核版本的區別

6、嘗試自己給kernel添加一個簡單的系統調用
   功能要求：調用這個系統調用，使用戶的uid等于0。（這個題目取自《邊干邊學》）
7、采用添加系統調用的方式實現一個新功能的利弊有哪些，替代方法是什么？
------------------------------
內核學習第六周計劃

這一周就總結了一下中斷和下半部相關的知識點，供大家參考一下！同時附件里有我以前學習時候的筆記，寫的不好請多多指教。
內核學習第六周計劃:
1、如何理解中斷、中斷上下文和進程上下文的區別、為何中斷不能睡眠
2、關于x86中選擇子、描述符和各種門的理解
3、查閱相關資料和內核源碼理解：
中斷是如何發生以及硬件和內核是如何相應的，如何返回的
x86上中斷發生時上下文(寄存器)如何保存以及中斷返回時上下文如何恢復的，系統的第一個任務是如何啟動的

4、內核中安排下半部的理由
5、軟中斷及其他的下半部策略適用于什么樣的任務和場合？
6、下半部可以睡眠么？為什么？
7、2.4和2.6內核中下半部包括哪些部分，為何2.6內核相比2.4內核會做這樣的改進
8、閱讀內核中關于軟中斷、tasklet以及工作隊列的代碼、相關書籍和資料，總結如下兩個問題:
軟中斷、tasklet以及工作隊列是如何初始化，注冊以及觸發的，使用了哪些關鍵的數據結構及內核變量？
軟中斷、tasklet以及工作隊列都在什么場合下使用？
------------------------------
內核學習第七周計劃
內核學習第七周計劃：
時間：12月13日--12月19日
內容：內核同步的理論知識。

問題：
1.為什么要進行內核的同步？
2.內核中怎么定義原子操作？
3.競爭產生的條件與加鎖的順序？
4.要保護的對象？
5.死鎖產生的條件與解決辦法？
6.你有什么比較好的方法來調試多線程的程序？
7.據一個內核中產生競爭的例子。
------------------------------
內核學習第八周計劃
第八周：
內容:
上一周我們分析了內核中的同步，這一次我們要接觸的是內核中怎么進行同步和互斥。

問題：
1.原子操作的粒度問題；
2.自旋鎖的設計及其應用場合,分析自旋鎖；
3.信號量及其應用，信號量怎么使用？
4.鎖的粒度以及其分類；
5.內核可搶占性的實現及其與鎖的關系；
6.smp中需要考慮哪些同步與互斥；
------------------------------
內核學習第九周計劃
前面的話：
   內核學習已經進行了兩個月的時間，LKD這本書我們基本上已經進行了一半。希望大家在下面多用功。
我們在這個計劃里面列出來的只是一些比較基本的問題，而且我們沒有給出問題的答案，但是只要你在下面
用功了，我想答案就在這本書里和源代碼里面。

本周內容：TIMERS AND TIME MANAGEMENT

1.HZ和jiffies值的定義?
2.內核中怎樣解決jiffies的回繞？為什么這樣可以解決jiffies回繞？
3.時鐘中斷處理程序有哪些值得注意的地方？
4.xtime_lock鎖和seqlock鎖?
5.定時器的實現、使用和競爭條件？
6.udelay()&mdelay()?
------------------------------
內核學習第十周計劃：
內存管理
   學習內容
   內存管理是比較龐大的一個部分，在lkd這本書中用了很少的篇幅，從這里面我們基本能看清楚內存管理的概貌。《情景分析》一書關于內存管理的部分講得比較多，代碼分析比較透徹也比較深入。但是相對的難度也比較大，建議先看看lkd這本書，然后再看《情景分析》一書的內存管理。

   問題：
   1.內核中內存的分頁、分區；
   2.內核中有哪些函數來獲得內存？內核中分配內存要注意什么？
   3.為什么使用slab?slab對象的詳細分析。
   4.內核棧上內存的靜態分配問題；

posted @ 2007-09-24 11:22 ZelluX 閱讀(657) | 評論 (0) | 編輯收藏

總算弄明白生成first和follow集合的算法了

太笨了，看了好久才明白。。。
first集合沒有問題

follow集合：
以A -> aBb為例，如果b nullable，則follow(B)包括follow(A)，原因很簡單，把A看成一個整體，當作為production的右式時它后面直接跟的元素自然也可能是B后面直接跟的元素，因為b可能為空。

理解follow集合的定義后，虎書上給出的算法
if Yi+1 ... Yk are all nullable
then FOLLOW(Yi) = FOLLOW(Yi) U FOLLOW(X)
if Yi+1 ... Yj-1 are all nullable
then FOLLOW(Yi) = FOLLOW(Yi) U FIRST(Yj)
就不難理解了

posted @ 2007-09-23 22:22 ZelluX 閱讀(2444) | 評論 (2) | 編輯收藏

A Note on TMin

關于表示32位整型中最小的數-2147483648的一篇文章,from CSAPP http://csapp.cs.cmu.edu/public/tmin.html

Summary

Due to one of the rules for processing integer constants in ANSI C, the numeric constant -2147483648 is handled in a peculiar way on a 32-bit, two's complement machine.

The problem can be corrected by writing -2147483647-1, rather than -2147483648 in any C code.

Description of Problem

The ANSI C standard requires that an integer constant too large to be represented as a signed integer be ``promoted'' to an unsigned value. When GCC encounters the value 2147483648, it gives a warning message: ``warning: decimal constant is so large that it is unsigned.'' The result is the same as if the value had been written 2147483648U.
ANSI C標準中要求太大的整型常量都表示為unsigned

The compiler processes an expression of the form -X by first reading the expression X and then negating it. Thus, when the C compiler encounters the constant -2147483648, it first processes 2147483648, yielding 2147483648U, and then negates it. The unsigned negation of this value is also 2147483648U. The bit pattern is correct, but the type is wrong!

書上的一個小錯誤

Writing TMin in Code

The ANSI C standard states that the maximum and minimum integers should be declared as constants INT_MAX and INT_MIN in the file limits.h. Looking at this file on an IA32 Linux machine (in the directory /usr/include), we find the following declarations:

/* Minimum and maximum values a `signed int' can hold.  */

#define INT_MAX	2147483647

#define INT_MIN	(-INT_MAX - 1)

This method of declaring INT_MIN avoids accidental promotion and also avoids any warning messages by the C compiler about integer overflow.

The following are ways to write TMin_32 for a 32-bit machine that give the correct value and type, and don't cause any error messages:

-2147483647-1
(int) 2147483648U
1<<31

The first method is preferred, since it indicates that the result will be a negative number.
表示-2147483648的幾種方法

posted @ 2007-09-19 21:26 ZelluX 閱讀(552) | 評論 (1) | 編輯收藏

系統恢復記

原來只給機房電腦的Arch配了20G的空間，現在想裝個virtualbox，空間自然不夠了，于是到windows下刪了一個分區用于掛載/home的內容。
然后問題就出來了，由于少了一個分區，原來arch所在的分區(sda8)向前移了一位(sda7)，結果grub啟動出錯，無法進入系統。
從機房管理員那借來Redhat的安裝盤，進入rescue模式，掛在好/dev/sda7后，修改boot/grub/menu.lst中的盤符，重啟，問題依舊。
只好先恢復winxp再說，用winxp工具盤啟動后，fdisk /mbr，重啟，安裝grub for dos，再重啟，進入grub
grub> root (hd0,6)
grub> setup (hd0)
報錯，估計是dos下的問題，只好手動進入arch
grub> root (hd0,6)
grub> kernel /boot/vmlinuz26 root=/dev/sda7 ro
grub> initrd /boot/kernel26.img
grub> boot
進入后仍然有錯誤，說是sda8無法訪問，這時才想起fstab還沒改過，但是當前的掛載的sda7還是只讀的，只好再用redhat的啟動盤啟動，掛載sda7后修改fstab中的信息。

問題解決。

posted @ 2007-09-19 16:07 ZelluX 閱讀(319) | 評論 (0) | 編輯收藏

[zz] proftpd 學習筆記

摘要: 閱讀全文

posted @ 2007-09-18 23:16 ZelluX 閱讀(326) | 評論 (0) | 編輯收藏

Catalan 數

n個不同的物體按固定次序入棧，隨時可以出棧，求最后可能的出棧序列的總數。
只想到這個問題等價于把n個push和n個pop操作排列，要求任意前幾個操作中push數都不能少于pop數，至于這個排列數怎么求就不知道了。請教了peter大牛后，原來這就是一個Catalan數的應用。

Wikipedia上的Catalan numbers:

In combinatorial mathematics, the Catalan numbers form a sequence of natural numbers that occur in various counting problems, often involving recursively defined objects. They are named for the Belgian mathematician Eugène Charles Catalan (1814–1894).

The n^th Catalan number is given directly in terms of binomial coefficients by

$C_n = \frac{1}{n+1}{2n\choose n} = \frac{(2n)!}{(n+1)!\,n!} \qquad\mbox{ for }n\ge 0.$

The first Catalan numbers (sequence A000108 in OEIS) for n = 0, 1, 2, 3, … are

1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, 58786, 208012, 742900, 2674440, 9694845, 35357670, 129644790, 477638700, 1767263190, 6564120420, 24466267020, 91482563640, 343059613650, 1289904147324, 4861946401452, …

Properties

An alternative expression for C_n is

$C_n = {2n\choose n} - {2n\choose n-1} \quad\mbox{ for }n\ge 1.$

This shows that C_n is a natural number, which is not a priori obvious from the first formula given. This expression forms the basis for André's proof of the correctness of the formula (see below under second proof).

The Catalan numbers satisfy the recurrence relation

$C_0 = 1 \quad \mbox{and} \quad C_{n+1}=\sum_{i=0}^{n}C_i\,C_{n-i}\quad\mbox{for }n\ge 0.$

They also satisfy:

$C_0 = 1 \quad \mbox{and} \quad C_{n+1}=\frac{2(2n+1)}{n+2}C_n,$

which can be a more efficient way to calculate them.

Asymptotically, the Catalan numbers grow as

$C_n \sim \frac{4^n}{n^{3/2}\sqrt{\pi}}$

in the sense that the quotient of the n^th Catalan number and the expression on the right tends towards 1 for n → ∞. (This can be proved by using for n!.)

[edit]

Applications in combinatorics

There are many counting problems in combinatorics whose solution is given by the Catalan numbers. The book Enumerative Combinatorics: Volume 2 by combinatorialist Richard P. Stanley contains a set of exercises which describe 66 different interpretations of the Catalan numbers. Following are some examples, with illustrations of the case C₃ = 5.

C_n is the number of Dyck words of length 2n. A Dyck word is a string consisting of n X's and n Y's such that no initial segment of the string has more Y's than X's (see also Dyck language). For example, the following are the Dyck words of length 6:

XXXYYY XYXXYY XYXYXY XXYYXY XXYXYY.

Re-interpreting the symbol X as an open parenthesis and Y as a close parenthesis, C_n counts the number of expressions containing n pairs of parentheses which are correctly matched:

((())) ()(()) ()()() (())() (()())

C_n is the number of different ways n + 1 factors can be completely parenthesized (or the number of ways of associating n applications of a binary operator). For n = 3 for example, we have the following five different parenthesizations of four factors:

$((ab)c)d \quad (a(bc))d \quad(ab)(cd) \quad a((bc)d) \quad a(b(cd))$

Successive applications of a binary operator can be represented in terms of a binary tree. It follows that C_n is the number of rooted ordered binary trees with n + 1 leaves:

If the leaves are labelled, we have the quadruple factorial numbers.

C_n is the number of non-isomorphic full binary trees with n vertices that have children, usually called internal vertices or branches. (A rooted binary tree is full if every vertex has either two children or no children.)

C_n is the number of monotonic paths along the edges of a grid with n × n square cells, which do not cross the diagonal. A monotonic path is one which starts in the lower left corner, finishes in the upper right corner, and consists entirely of edges pointing rightwards or upwards. Counting such paths is equivalent to counting Dyck words: X stands for "move right" and Y stands for "move up". The following diagrams show the case n = 4:

Image:Catalan number 4x4 grid example.svg

C_n is the number of different ways a convex polygon with n + 2 sides can be cut into triangles by connecting vertices with straight lines. The following hexagons illustrate the case n = 4:

Error creating thumbnail:

C_n is the number of stack-sortable permutations of {1, ..., n}. A permutation w is called stack-sortable if S(w) = (1, ..., n), where S(w) is defined recursively as follows: write w = unv where n is the largest element in w and u and v are shorter sequences, and set S(w) = S(u)S(v)n, with S being the identity for one-element sequences.

C_n is the number of noncrossing partitions of the set { 1, ..., n }. A fortiori, C_n never exceeds the nth Bell number. C_n is also the number of noncrossing partitions of the set { 1, ..., 2n } in which every block is of size 2. The conjunction of these two facts may be used in a proof by mathematical induction that all of the free cumulants of degree more than 2 of the Wigner semicircle law are zero. This law is important in free probability theory and the theory of random matrices.

[edit]

Proof of the formula

There are several ways of explaining why the formula

$C_n = \frac{1}{n+1}{2n\choose n}$

solves the combinatorial problems listed above. The first proof below uses a generating function. The second and third proofs are examples of bijective proofs; they involve literally counting a collection of some kind of object to arrive at the correct formula.

[edit]

First proof

We start with the observation that several of the combinatorial problems listed above can easily be seen to satisfy the recurrence relation

$C_0 = 1 \quad \mbox{and} \quad C_{n+1}=\sum_{i=0}^{n}C_i\,C_{n-i}\quad\mbox{for }n\ge 0.$

For example, every Dyck word w of length ≥ 2 can be written in a unique way in the form

w = Xw₁Yw₂

with (possibly empty) Dyck words w₁ and w₂.

The generating function for the Catalan numbers is defined by

$c(x)=\sum_{n=0}^\infty C_n x^n.$

As explained in the article titled Cauchy product, the sum on the right side of the above recurrence relation is the coefficient of xⁿ in the product

$\left(\sum_{i=0}^\infty C_i x^i\right)^2.$

Therefore

$\left(\sum_{i=0}^\infty C_i x^i\right)^2 = \sum_{n=0}^\infty C_{n+1} x^n.$

Multiplying both sides by x, we get

$x\left(\sum_{i=0}^\infty C_i x^i\right)^2 = \sum_{n=0}^\infty C_{n+1} x^{n+1} = \sum_{n=1}^\infty C_n x^n = -1 + \sum_{n=0}^\infty C_n x^n.$

So we have

$c(x)=1+xc(x)^2,\,$

and hence

$c(x) = \frac{1-\sqrt{1-4x}}{2x}.$

The square root term can be expanded as a power series using the identity

$\sqrt{1+y} = 1 - 2\sum_{n=1}^\infty {2n-2 \choose n-1} \left(\frac{-1}{4}\right)^n \frac{y^n}{n} ,$

which can be proved, for example, by the binomial theorem, (or else directly by considering repeated derivatives of $\sqrt{1+y}$ ) together with judicious juggling of factorials. Substituting this into the above expression for c(x) produces, after further manipulation,

$c(x) = \sum_{n=0}^\infty {2n \choose n} \frac{x^n}{n+1.}$

Equating coefficients yields the desired formula for C_n.

[edit]

Second proof

This proof depends on a trick due to D. André, which is now more generally known as the reflection principle (not to be confused with the Schwarz reflection theorem in complex analysis). It is most easily expressed in terms of the "monotonic paths which do not cross the diagonal" problem (see above).

Figure 1. The green portion of the path is being flipped.

Suppose we are given a monotonic path in an n × n grid that does cross the diagonal. Find the first edge in the path that lies above the diagonal, and flip the portion of the path occurring after that edge, along a line parallel to the diagonal. (In terms of Dyck words, we are starting with a sequence of n X's and n Y's which is not a Dyck word, and exchanging all X's with Y's after the first Y that violates the Dyck condition.) The resulting path is a monotonic path in an (n − 1) × (n + 1) grid. Figure 1 illustrates this procedure; the green portion of the path is the portion being flipped.

Since every monotonic path in the (n − 1) × (n + 1) grid must cross the diagonal at some point, every such path can be obtained in this fashion in precisely one way. The number of these paths is equal to

${2n\choose n-1}$ .

Therefore, to calculate the number of monotonic n × n paths which do not cross the diagonal, we need to subtract this from the total number of monotonic n × n paths, so we finally obtain

${2n\choose n}-{2n\choose n-1}$

which is the nth Catalan number C_n.

[edit]

Third proof

The following bijective proof, while being more involved than the previous one, provides a more natural explanation for the term n + 1 appearing in the denominator of the formula for C_n.

Figure 2. A path with exceedance 5.

Suppose we are given a monotonic path, which may happen to cross the diagonal. The exceedance of the path is defined to be the number of pairs of edges which lie above the diagonal. For example, in Figure 2, the edges lying above the diagonal are marked in red, so the exceedance of the path is 5.

Now, if we are given a monotonic path whose exceedance is not zero, then we may apply the following algorithm to construct a new path whose exceedance is one less than the one we started with.

Starting from the bottom left, follow the path until it first travels above the diagonal.
Continue to follow the path until it touches the diagonal again. Denote by X the first such edge that is reached.
Swap the portion of the path occurring before X with the portion occurring after X.

The following example should make this clearer. In Figure 3, the black circle indicates the point where the path first crosses the diagonal. The black edge is X, and we swap the red portion with the green portion to make a new path, shown in the second diagram.

Figure 3. The green and red portions are being exchanged.

Notice that the exceedance has dropped from three to two. In fact, the algorithm will cause the exceedance to decrease by one, for any path that we feed it.

Figure 4. All monotonic paths in a 3×3 grid, illustrating the exceedance-decreasing algorithm.

It is also not difficult to see that this process is reversible: given any path P whose exceedance is less than n, there is exactly one path which yields P when the algorithm is applied to it.

This implies that the number of paths of exceedance n is equal to the number of paths of exceedance n − 1, which is equal to the number of paths of exceedance n − 2, and so on, down to zero. In other words, we have split up the set of all monotonic paths into n + 1 equally sized classes, corresponding to the possible exceedances between 0 and n. Since there are

${2n\choose n}$

monotonic paths, we obtain the desired formula

$C_n = \frac{1}{n+1}{2n\choose n}.$

Figure 4 illustrates the situation for n = 3. Each of the 20 possible monotonic paths appears somewhere in the table. The first column shows all paths of exceedance three, which lie entirely above the diagonal. The columns to the right show the result of successive applications of the algorithm, with the exceedance decreasing one unit at a time. Since there are five rows, C₃ = 5.

[edit]

Hankel matrix

The n×n Hankel matrix whose (i, j) entry is the Catalan number C_i+j has determinant 1, regardless of the value of n. For example, for n = 4 we have

$\det\begin{bmatrix}1 & 1 & 2 & 5 \\ 1 & 2 & 5 & 14 \\ 2 & 5 & 14 & 42 \\ 5 & 14 & 42 & 132\end{bmatrix} = 1$

Note that if the entries are ``shifted", namely the Catalan numbers C_i+j+1, the determinant is still 1, regardless of the size of n. For example, for n = 4 we have

$\det\begin{bmatrix}1 & 2 & 5 & 14 \\ 2 & 5 & 14 & 42 \\ 5 & 14 & 42 & 132 \\ 14 & 42 & 132 & 429 \end{bmatrix} = 1$ .

The Catalan numbers is the unique sequence with this property.

[edit]

Quadruple factorial

The quadruple factorial is given by $\frac{2n!}{n!}$ , or $\left(n+1\right)! C_n$ . This is the solution to labelled variants of the above combinatorics problems. It is entirely distinct from the multifactorials.

[edit]

History

The Catalan sequence was first described in the 18th century by Leonhard Euler, who was interested in the number of different ways of dividing a polygon into triangles. The sequence is named after Eugène Charles Catalan, who discovered the connection to parenthesized expressions. The counting trick for Dyck words was found by D. André in 1887.

posted @ 2007-09-17 19:55 ZelluX 閱讀(1495) | 評論 (2) | 編輯收藏

僅列出標題

Snowdream

char *s = "hello"和char s1[]="world"的區別

C++ 學習筆記 (6)

C++ 入門筆記 (5)

LL(1) 語法解析

zz 整理的內核學習計劃

總算弄明白生成first和follow集合的算法了

A Note on TMin

Summary

Description of Problem

Writing TMin in Code

系統恢復記

[zz] proftpd 學習筆記

Catalan 數

Properties

Applications in combinatorics

Proof of the formula

First proof

Second proof

Third proof

Hankel matrix

Quadruple factorial

History

日歷

常用鏈接

留言簿(21)

隨筆分類(390)

隨筆檔案(389)

文章檔案(7)

相冊

15ers

友情鏈接

收藏夾

搜索

積分與排名

最新隨筆

最新評論

閱讀排行榜

評論排行榜