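Sometimes every line of a text file on Linux ends with a ^M (a carriage return left over from DOS/Windows line endings). How do you get rid of it?
Use vi's substitute command, "%s/(string to be replaced)/(replacement string)/g" (g means global):
:%s/^M//g
To type ^M, press Ctrl-V and then Ctrl-M.
Then remember to save the file.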
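The same cleanup can be done in a program. The following is a minimal sketch in C, assuming the file content has already been read into a buffer; `strip_cr` is a hypothetical helper name, not something from the note above. Unlike the vi command, it only drops a carriage return that is directly followed by a newline.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: remove, in place, every '\r' that is immediately
 * followed by '\n' in buf[0..len-1], and return the new length.
 * A '\r' anywhere else (including at the very end) is left untouched.
 */
size_t strip_cr(char *buf, size_t len)
{
    size_t in, out = 0;

    assert(buf != NULL);
    for (in = 0; in < len; in++) {
        if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
            continue;   /* drop the CR of a CRLF pair */
        buf[out++] = buf[in];
    }
    return out;
}
```

The function works on a length-delimited buffer rather than a C string, so it can be applied to each chunk read from a file; a CRLF pair split across two chunks would need extra handling.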
Thursday, May 29, 2008
R&D Notes - Removing ^M from a Text File
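Tuesday, May 27, 2008
R&D Notes - Coding Style
Using a single coding style gives the whole team a common reference and makes code faster to read.
The following document comes from Documentation/CodingStyle in the Linux kernel: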
Linux kernel coding style
This is a short document describing the preferred coding style for the
linux kernel. Coding style is very personal, and I won't _force_ my
views on anybody, but this is what goes for anything that I have to be
able to maintain, and I'd prefer it for most other things too. Please
at least consider the points made here.
First off, I'd suggest printing out a copy of the GNU coding standards,
and NOT read it. Burn them, it's a great symbolic gesture.
Anyway, here goes:
Chapter 1: Indentation
Tabs are 8 characters, and thus indentations are also 8 characters.
There are heretic movements that try to make indentations 4 (or even 2!)
characters deep, and that is akin to trying to define the value of PI to
be 3.
Rationale: The whole idea behind indentation is to clearly define where
a block of control starts and ends. Especially when you've been looking
at your screen for 20 straight hours, you'll find it a lot easier to see
how the indentation works if you have large indentations.
Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on a
80-character terminal screen. The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.
In short, 8-char indents make things easier to read, and have the added
benefit of warning you when you're nesting your functions too deep.
Heed that warning.
Don't put multiple statements on a single line unless you have
something to hide:
	if (condition) do_this;
	  do_something_everytime;
Outside of comments, documentation and except in Kconfig, spaces are never
used for indentation, and the above example is deliberately broken.
Get a decent editor and don't leave whitespace at the end of lines.
Chapter 2: Breaking long lines and strings
Coding style is all about readability and maintainability using commonly
available tools.
The limit on the length of lines is 80 columns and this is a hard limit.
Statements longer than 80 columns will be broken into sensible chunks.
Descendants are always substantially shorter than the parent and are placed
substantially to the right. The same applies to function headers with a long
argument list. Long strings are as well broken into shorter strings.
void fun(int a, int b, int c)
{
	if (condition)
		printk(KERN_WARNING "Warning this is a long printk with "
						"3 parameters a: %u b: %u "
						"c: %u \n", a, b, c);
	else
		next_statement;
}
Chapter 3: Placing Braces
The other issue that always comes up in C styling is the placement of
braces. Unlike the indent size, there are few technical reasons to
choose one placement strategy over the other, but the preferred way, as
shown to us by the prophets Kernighan and Ritchie, is to put the opening
brace last on the line, and put the closing brace first, thusly:
	if (x is true) {
		we do y
	}
However, there is one special case, namely functions: they have the
opening brace at the beginning of the next line, thus:
	int function(int x)
	{
		body of function
	}
Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right. Besides, functions are
special anyway (you can't nest them in C).
Note that the closing brace is empty on a line of its own, _except_ in
the cases where it is followed by a continuation of the same statement,
i.e. a "while" in a do-statement or an "else" in an if-statement, like
this:
	do {
		body of do-loop
	} while (condition);
and
	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}
Rationale: K&R.
Also, note that this brace-placement also minimizes the number of empty
(or almost empty) lines, without any loss of readability. Thus, as the
supply of new-lines on your screen is not a renewable resource (think
25-line terminal screens here), you have more empty lines to put
comments on.
Chapter 4: Naming
C is a Spartan language, and so should your naming be. Unlike Modula-2
and Pascal programmers, C programmers do not use cute names like
ThisVariableIsATemporaryCounter. A C programmer would call that
variable "tmp", which is much easier to write, and not the least more
difficult to understand.
HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must. To call a global function "foo" is a
shooting offense.
GLOBAL variables (to be used only if you _really_ need them) need to
have descriptive names, as do global functions. If you have a function
that counts the number of active users, you should call that
"count_active_users()" or similar, you should _not_ call it "cntusr()".
Encoding the type of a function into the name (so-called Hungarian
notation) is brain damaged - the compiler knows the types anyway and can
check those, and it only confuses the programmer. No wonder MicroSoft
makes buggy programs.
LOCAL variable names should be short, and to the point. If you have
some random integer loop counter, it should probably be called "i".
Calling it "loop_counter" is non-productive, if there is no chance of it
being mis-understood. Similarly, "tmp" can be just about any type of
variable that is used to hold a temporary value.
If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See next chapter.
Chapter 5: Functions
Functions should be short and sweet, and do just one thing. They should
fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
as we all know), and do one thing and do that well.
The maximum length of a function is inversely proportional to the
complexity and indentation level of that function. So, if you have a
conceptually simple function that is just one long (but simple)
case-statement, where you have to do lots of small things for a lot of
different cases, it's OK to have a longer function.
However, if you have a complex function, and you suspect that a
less-than-gifted first-year high-school student might not even
understand what the function is all about, you should adhere to the
maximum limits all the more closely. Use helper functions with
descriptive names (you can ask the compiler to in-line them if you think
it's performance-critical, and it will probably do a better job of it
than you would have done).
Another measure of the function is the number of local variables. They
shouldn't exceed 5-10, or you're doing something wrong. Re-think the
function, and split it into smaller pieces. A human brain can
generally easily keep track of about 7 different things, anything more
and it gets confused. You know you're brilliant, but maybe you'd like
to understand what you did 2 weeks from now.
Chapter 6: Centralized exiting of functions
Albeit deprecated by some people, the equivalent of the goto statement is
used frequently by compilers in form of the unconditional jump instruction.
The goto statement comes in handy when a function exits from multiple
locations and some common work such as cleanup has to be done.
The rationale is:
- unconditional statements are easier to understand and follow
- nesting is reduced
- errors by not updating individual exit points when making
modifications are prevented
- saves the compiler work to optimize redundant code away ;)
int fun(int a)
{
	int result = 0;
	char *buffer = kmalloc(SIZE, GFP_KERNEL);

	if (buffer == NULL)
		return -ENOMEM;

	if (condition1) {
		while (loop1) {
			...
		}
		result = 1;
		goto out;
	}
	...
out:
	kfree(buffer);
	return result;
}
Chapter 7: Commenting
Comments are good, but there is also a danger of over-commenting. NEVER
try to explain HOW your code works in a comment: it's much better to
write the code so that the _working_ is obvious, and it's a waste of
time to explain badly written code.
Generally, you want your comments to tell WHAT your code does, not HOW.
Also, try to avoid putting comments inside a function body: if the
function is so complex that you need to separately comment parts of it,
you should probably go back to chapter 5 for a while. You can make
small comments to note or warn about something particularly clever (or
ugly), but try to avoid excess. Instead, put the comments at the head
of the function, telling people what it does, and possibly WHY it does
it.
When commenting the kernel API functions, please use the kerneldoc format.
See the files Documentation/kernel-doc-nano-HOWTO.txt and scripts/kernel-doc
for details.
Chapter 8: You've made a mess of it
That's OK, we all do. You've probably been told by your long-time Unix
user helper that "GNU emacs" automatically formats the C sources for
you, and you've noticed that yes, it does do that, but the defaults it
uses are less than desirable (in fact, they are worse than random
typing - an infinite number of monkeys typing into GNU emacs would never
make a good program).
So, you can either get rid of GNU emacs, or change it to use saner
values. To do the latter, you can stick the following in your .emacs file:
(defun linux-c-mode ()
  "C mode with adjusted defaults for use with the Linux kernel."
  (interactive)
  (c-mode)
  (c-set-style "K&R")
  (setq tab-width 8)
  (setq indent-tabs-mode t)
  (setq c-basic-offset 8))
This will define the M-x linux-c-mode command. When hacking on a
module, if you put the string -*- linux-c -*- somewhere on the first
two lines, this mode will be automatically invoked. Also, you may want
to add
(setq auto-mode-alist (cons '("/usr/src/linux.*/.*\\.[ch]$" . linux-c-mode)
			auto-mode-alist))
to your .emacs file if you want to have linux-c-mode switched on
automagically when you edit source files under /usr/src/linux.
But even if you fail in getting emacs to do sane formatting, not
everything is lost: use "indent".
Now, again, GNU indent has the same brain-dead settings that GNU emacs
has, which is why you need to give it a few command line options.
However, that's not too bad, because even the makers of GNU indent
recognize the authority of K&R (the GNU people aren't evil, they are
just severely misguided in this matter), so you just give indent the
options "-kr -i8" (stands for "K&R, 8 character indents"), or use
"scripts/Lindent", which indents in the latest style.
"indent" has a lot of options, and especially when it comes to comment
re-formatting you may want to take a look at the man page. But
remember: "indent" is not a fix for bad programming.
Chapter 9: Configuration-files
For configuration options (arch/xxx/Kconfig, and all the Kconfig files),
somewhat different indentation is used.
Help text is indented with 2 spaces.
if CONFIG_EXPERIMENTAL
	tristate CONFIG_BOOM
	default n
	help
	  Apply nitroglycerine inside the keyboard (DANGEROUS)
	bool CONFIG_CHEER
	depends on CONFIG_BOOM
	default y
	help
	  Output nice messages when you explode
endif
Generally, CONFIG_EXPERIMENTAL should surround all options not considered
stable. All options that are known to trash data (experimental write-
support for file-systems, for instance) should be denoted (DANGEROUS), other
experimental options should be denoted (EXPERIMENTAL).
Chapter 10: Data structures
Data structures that have visibility outside the single-threaded
environment they are created and destroyed in should always have
reference counts. In the kernel, garbage collection doesn't exist (and
outside the kernel garbage collection is slow and inefficient), which
means that you absolutely _have_ to reference count all your uses.
Reference counting means that you can avoid locking, and allows multiple
users to have access to the data structure in parallel - and not having
to worry about the structure suddenly going away from under them just
because they slept or did something else for a while.
Note that locking is _not_ a replacement for reference counting.
Locking is used to keep data structures coherent, while reference
counting is a memory management technique. Usually both are needed, and
they are not to be confused with each other.
Many data structures can indeed have two levels of reference counting,
when there are users of different "classes". The subclass count counts
the number of subclass users, and decrements the global count just once
when the subclass count goes to zero.
Examples of this kind of "multi-level-reference-counting" can be found in
memory management ("struct mm_struct": mm_users and mm_count), and in
filesystem code ("struct super_block": s_count and s_active).
Remember: if another thread can find your data structure, and you don't
have a reference count on it, you almost certainly have a bug.
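The get/put discipline described in this chapter can be sketched in a few lines. The following is a userspace illustration using C11 atomics, not the kernel's own reference-counting primitives; `struct session` and the `session_*` names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object type, used only for illustration. */
struct session {
	atomic_int refcount;
	int id;
};

/* Allocate a session that starts with one reference, owned by the caller. */
struct session *session_new(int id)
{
	struct session *s = malloc(sizeof(*s));

	if (s == NULL)
		return NULL;
	atomic_init(&s->refcount, 1);
	s->id = id;
	return s;
}

/* Take an additional reference; the caller must already hold one. */
struct session *session_get(struct session *s)
{
	assert(s != NULL);
	atomic_fetch_add(&s->refcount, 1);
	return s;
}

/*
 * Drop one reference. Returns 1 if this was the last reference and the
 * object was freed, 0 otherwise.
 */
int session_put(struct session *s)
{
	assert(s != NULL);
	if (atomic_fetch_sub(&s->refcount, 1) == 1) {
		free(s);
		return 1;
	}
	return 0;
}
```

Every user that stores a pointer to the object calls `session_get()` first and `session_put()` when done, so the object cannot disappear while anyone still holds it. As the chapter notes, this manages lifetime only; keeping the fields coherent still needs locking.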
Chapter 11: Macros, Enums and RTL
Names of macros defining constants and labels in enums are capitalized.
#define CONSTANT 0x12345
Enums are preferred when defining several related constants.
CAPITALIZED macro names are appreciated but macros resembling functions
may be named in lower case.
Generally, inline functions are preferable to macros resembling functions.
Macros with multiple statements should be enclosed in a do - while block:
#define macrofun(a, b, c) 			\
	do {					\
		if (a == 5)			\
			do_this(b, c);		\
	} while (0)
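Why the do - while wrapper matters is easiest to see inside an unbraced if/else. The sketch below is an illustration only: `do_this()`, `calls`, `run()` and `macro_demo()` are hypothetical stand-ins, and the macro parameters are additionally parenthesized. With a bare `if` as the macro body, the caller's `else` would silently attach to the macro's `if`; with a bare `{ ... }` block, the trailing semicolon would make the `else` a syntax error.

```c
#include <assert.h>

static int calls;	/* running total of b + c seen by do_this() */

static void do_this(int b, int c)
{
	calls += b + c;
}

/* Same shape as the macro above, with each parameter parenthesized. */
#define macrofun(a, b, c)			\
	do {					\
		if ((a) == 5)			\
			do_this((b), (c));	\
	} while (0)

/*
 * The macro expands to exactly one statement, so the else below still
 * belongs to "if (enabled)". Returns 1 when the else branch ran.
 */
int run(int enabled, int a)
{
	int took_else = 0;

	if (enabled)
		macrofun(a, 1, 2);
	else
		took_else = 1;
	return took_else;
}

void macro_demo(void)
{
	calls = 0;
	assert(run(1, 5) == 0 && calls == 3);	/* macro body ran once */
	assert(run(1, 4) == 0 && calls == 3);	/* a != 5: nothing happens */
	assert(run(0, 5) == 1 && calls == 3);	/* else belongs to the outer if */
}
```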
Things to avoid when using macros:
1) macros that affect control flow:
#define FOO(x)					\
	do {					\
		if (blah(x) < 0)		\
			return -EBUGGERED;	\
	} while(0)
is a _very_ bad idea. It looks like a function call but exits the "calling"
function; don't break the internal parsers of those who will read the code.
2) macros that depend on having a local variable with a magic name:
#define FOO(val) bar(index, val)
might look like a good thing, but it's confusing as hell when one reads the
code and it's prone to breakage from seemingly innocent changes.
3) macros with arguments that are used as l-values: FOO(x) = y; will
bite you if somebody e.g. turns FOO into an inline function.
4) forgetting about precedence: macros defining constants using expressions
must enclose the expression in parentheses. Beware of similar issues with
macros using parameters.
#define CONSTANT 0x4000
#define CONSTEXP (CONSTANT | 3)
The cpp manual deals with macros exhaustively. The gcc internals manual also
covers RTL which is used frequently with assembly language in the kernel.
Chapter 12: Printing kernel messages
Kernel developers like to be seen as literate. Do mind the spelling
of kernel messages to make a good impression. Do not use crippled
words like "dont" and use "do not" or "don't" instead.
Kernel messages do not have to be terminated with a period.
Printing numbers in parentheses (%d) adds no value and should be avoided.
Chapter 13: Allocating memory
The kernel provides the following general purpose memory allocators:
kmalloc(), kzalloc(), kcalloc(), and vmalloc(). Please refer to the API
documentation for further information about them.
The preferred form for passing a size of a struct is the following:
p = kmalloc(sizeof(*p), ...);
The alternative form where struct name is spelled out hurts readability and
introduces an opportunity for a bug when the pointer variable type is changed
but the corresponding sizeof that is passed to a memory allocator is not.
Casting the return value which is a void pointer is redundant. The conversion
from void pointer to any other pointer type is guaranteed by the C programming
language.
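The same idiom applies outside the kernel. Here is a minimal userspace sketch with malloc() standing in for kmalloc(); `struct point`, `point_new()` and `point_demo()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record type, used only for illustration. */
struct point {
	long x;
	long y;
};

/*
 * sizeof(*p) follows the type of p, so changing the pointer's type later
 * cannot leave a stale size behind. No cast is needed on the result:
 * void * converts implicitly to any object pointer type in C.
 */
struct point *point_new(long x, long y)
{
	struct point *p = malloc(sizeof(*p));

	if (p == NULL)
		return NULL;
	p->x = x;
	p->y = y;
	return p;
}

/* Usage: allocate, use, free. */
void point_demo(void)
{
	struct point *p = point_new(3, 4);

	assert(p != NULL);	/* a real caller would handle failure instead */
	assert(p->x == 3 && p->y == 4);
	free(p);
}
```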
Chapter 14: The inline disease
There appears to be a common misperception that gcc has a magic "make me
faster" speedup option called "inline". While the use of inlines can be
appropriate (for example as a means of replacing macros, see Chapter 11), it
very often is not. Abundant use of the inline keyword leads to a much bigger
kernel, which in turn slows the system as a whole down, due to a bigger
icache footprint for the CPU and simply because there is less memory
available for the pagecache. Just think about it; a pagecache miss causes a
disk seek, which easily takes 5 milliseconds. There are a LOT of cpu cycles
that can go into these 5 milliseconds.
A reasonable rule of thumb is to not put inline at functions that have more
than 3 lines of code in them. An exception to this rule are the cases where
a parameter is known to be a compiletime constant, and as a result of this
constantness you *know* the compiler will be able to optimize most of your
function away at compile time. For a good example of this latter case, see
the kmalloc() inline function.
Often people argue that adding inline to functions that are static and used
only once is always a win since there is no space tradeoff. While this is
technically correct, gcc is capable of inlining these automatically without
help, and the maintenance issue of removing the inline when a second user
appears outweighs the potential value of the hint that tells gcc to do
something it would have done anyway.
Chapter 15: References
The C Programming Language, Second Edition
by Brian W. Kernighan and Dennis M. Ritchie.
Prentice Hall, Inc., 1988.
ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
URL: http://cm.bell-labs.com/cm/cs/cbook/
The Practice of Programming
by Brian W. Kernighan and Rob Pike.
Addison-Wesley, Inc., 1999.
ISBN 0-201-61586-X.
URL: http://cm.bell-labs.com/cm/cs/tpop/
GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
gcc internals and indent, all available from http://www.gnu.org/manual/
WG14 is the international standardization working group for the programming
language C, URL: http://www.open-std.org/JTC1/SC22/WG14/
Kernel CodingStyle, by greg@kroah.com at OLS 2002:
http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/
--
Last updated on 30 December 2005 by a community effort on LKML.
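Wednesday, April 16, 2008
R&D Notes - A Small Program to Look Up an IP Address from a Hostname
Incident: Sometimes we only know a hostname, but we need the IP address to connect or to do some comparison, so we use gethostbyname() to obtain the IP.
The code is as follows: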
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Print every IPv4 address of each hostname given on the command line. */
int main(int argc, char **argv)
{
    int i, j;

    for (i = 1; i < argc; i++)
    {
        struct hostent *ht;

        ht = gethostbyname(argv[i]);
        if (!ht)
        {
            /* lookup failed; report it and move on to the next name */
            printf("error\n");
            continue;
        }
        printf("host =%s\n", argv[i]);
        if (ht->h_addrtype == AF_INET)
        {
            /* h_addr_list is a NULL-terminated list of struct in_addr */
            for (j = 0; ht->h_addr_list[j]; j++)
                printf("address: %s\n",
                       inet_ntoa(*(struct in_addr *)ht->h_addr_list[j]));
        }
    }
    return 0;
}
Sample run:
[root@ATC-P4-95 UPC]# ./test2 tw.yahoo.com
host =tw.yahoo.com
address: 203.84.202.164
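Newer POSIX code generally performs the same lookup with getaddrinfo(), which is reentrant and also covers IPv6. The following is a minimal sketch of that alternative, not part of the original note; `first_ipv4` is a hypothetical helper name, and it is restricted to IPv4 to match the program above.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: resolve host and write its first IPv4 address,
 * in dotted-quad form, into out (at least INET_ADDRSTRLEN bytes).
 * Returns 0 on success, -1 on failure.
 */
int first_ipv4(const char *host, char *out, size_t outlen)
{
    struct addrinfo hints, *res;
    const struct sockaddr_in *sin;
    int rc = -1;

    assert(host != NULL && out != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;          /* IPv4 only */
    hints.ai_socktype = SOCK_STREAM;    /* avoid one entry per socket type */

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    /* on success the list holds at least one entry */
    sin = (const struct sockaddr_in *)res->ai_addr;
    if (inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t)outlen) != NULL)
        rc = 0;
    freeaddrinfo(res);
    return rc;
}
```

A caller would declare `char ip[INET_ADDRSTRLEN];` and call `first_ipv4("tw.yahoo.com", ip, sizeof(ip))`. To list every address as the original program does, walk the `ai_next` links of the result instead of reading only the first entry.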
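Tuesday, March 25, 2008
R&D Notes - [Repost] The Ten Commandments for Newbie C Programmers
A good article worth sharing; many people online have archived it. I write C for embedded systems, and the problems it describes are exactly the ones we run into all the time, so I am archiving it here.
[Repost] The Ten Commandments for Newbie C Programmers
The Ten Commandments for Newbie C Programmers, by Khoguan Phuann
Please note:
(1) This article is meant to remind newcomers of the mistakes beginners commonly make (veterans make them too :-Q).
It is no substitute for thorough study; please read one or two good books on C
and practice a lot.
(2) Newcomers are strongly advised to read this article before asking questions; your question has very likely
been raised and answered here.
(3) If one of the wrong examples below prints the same result as the correct example on your machine,
that is mere luck and cannot be relied on.
(4) Break the commandments and, at best, the program outputs wrong data or crashes; at worst, it
detonates nuclear bombs and destroys the Earth (if your C program controls a nuclear missile launcher).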
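1. You shall not use a variable that has not been given a proper initial value.
Wrong example:
int accumulate(int max)    /* sum from 1 to max and return the result */
{
    int sum;    /* uninitialized local variable; its content is garbage */
    int num;
    for (num = 1; num <= max; num++) {
        sum += num;
    }
    return sum;
}
Correct example:
int accumulate(int max)
{
    int sum = 0;    /* correctly given a proper initial value */
    int num;
    for (num = 1; num <= max; num++) {
        sum += num;
    }
    return sum;
}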
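2. You shall not access memory beyond the bounds of an array.
Wrong example:
int str[5];
int i;
for (i = 0; i <= 5; i++) str[i] = i;
Correct example:
int str[5];
int i;
for (i = 0; i < 5; i++) str[i] = i;
Explanation: if an array is declared with N elements, the index values we may use
when accessing its elements through [index] run from 0 to N-1.
In other words, array elements in C and C++ are counted from 0, and the index of the
last element is N-1, not N.
For the sake of efficiency, C/C++ does not automatically check whether an array index is out of bounds;
we have to write the code that keeps us in bounds. Going out of bounds leads to unpredictable consequences.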
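3. You shall not dereference a pointer that points to nobody knows where (including a null pointer).
Wrong example:
char *pc1;      /* uninitialized; points to nobody knows where */
char *pc2 = 0;  /* pc2 initialized as a null pointer */
*pc1 = 'a';     /* writes 'a' to an unknown location: wrong */
*pc2 = 'b';     /* writes 'b' to "address 0": wrong */
Correct example:
char c;          /* the content of c is not yet initialized */
char *pc1 = &c;  /* pc1 points to the char variable c */
/* dynamically allocate 10 chars (values indeterminate) and assign the address of the first char to pc2 */
char *pc2 = (char *)malloc(10);
*pc1 = 'a';      /* the content of c becomes 'a' */
if (pc2 != NULL)     /* malloc() returns a null pointer on failure */
    pc2[0] = 'b';    /* character 0 of the allocated block becomes 'b' */
/* finally, remember to free() the space allocated by malloc() */
free(pc2);
Explanation: a pointer variable must first point to some definite thing (an object) before it can be operated on.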
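4. You shall not assign a string literal to a char* variable and then modify the content of
the string through that variable (read only, no writing).
Wrong example:
char* pc = "john";
*pc = 'J';
printf("Hello, %s\n", pc);
Correct example:
char pc[] = "john";
*pc = 'J';    /* or pc[0] = 'J'; */
printf("Hello, %s\n", pc);
Explanation: the content of a string literal is read-only. The wrong example assigns the address of
that content to the char pointer pc; through the pointer we may only read the literal's content and
should never write to it. The correct example instead declares a separate character array. We did not
state its size explicitly ([]), so the compiler automatically makes it just large enough to hold the
string literal used as its initializer, including the implicit '\0' after the string, and copies the
literal's content into the array. The content of the array can therefore be read and written freely.
Wrong example (2):
char *s1 = "Hello, ";
char *s2 = "world!";
/* strcat() does not allocate any space; it simply appends the data after the read-only string
   that s1 points to, writing into memory the program has no right to touch */
char *s3 = strcat(s1, s2);
Correct example (2):
/* declare s1 as an array and reserve enough room for the content to be appended */
char s1[20] = "Hello, ";
char *s2 = "world!";
/* strcat() returns its first argument, so s3 is no longer needed */
strcat(s1, s2);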
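5. You shall not perform (string) array operations on a char* variable whose target space has not been allocated.
The same goes for pointers of other types.
Wrong example:
char *name;    /* name does not yet point to valid space */
printf("Your name, please: ");
gets(name);
printf("Hello, %s\n", name);
Correct example (1):
/* if the maximum size of the string is known at compile time, declare it as char[] rather than char* */
char name[21];    /* at most 20 characters, plus one '\0' */
printf("Your name, please: ");
gets(name);
printf("Hello, %s\n", name);
Correct example (2):
/* if the maximum size of the string is only known at run time, use malloc() to allocate
   the space dynamically */
size_t length;
char *name;
printf("Enter the maximum string length (including the null character): ");
scanf("%zu", &length);    /* %zu is the conversion for size_t */
name = (char *)malloc(length);
printf("Your name, please: ");
scanf("%s", name);
printf("Hello, %s\n", name);
/* finally, remember to free() the space allocated by malloc() */
free(name);
Note: reading a string with gets() or scanf() as in the examples above is unsafe, because these functions
do not check whether the user's input is longer than the buffer we allocated, so a buffer overflow
may well occur (gets() was removed from the language in C11). A safer approach is to use fgets() instead, for example:
char *p;
char name[21];
printf("Your name, please: ");
fgets(name, sizeof(name), stdin);
/* fgets() also stores the trailing '\n' in the string, so find where the '\n' was stored and put '\0' there */
if ((p = strchr(name, '\n')) != NULL)
    *p = '\0';
printf("Hello, %s\n", name);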
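6. You shall not return from a function a pointer to a local automatic variable. Otherwise you will get garbage.
[Thanks to gocpp for the code example]
Wrong example:
char *getstr(char *name)
{
    char buf[30] = "hello, ";    /* copies the content of the string literal "hello, " into the array buf */
    strcat(buf, name);
    return buf;
}
Explanation: a local automatic variable is destroyed when its scope is left (here, when getstr returns),
so the string that the caller's pointer refers to is no longer valid. [It is, however, fine to return a
string literal directly from a function and assign it to a const char * variable in the caller: it is
read-only (see the Fourth Commandment) and also has static storage duration, so its
content remains valid throughout.]
Correct example:
void getstr(char buf[], int buflen, char const *name)
{
    char const s[] = "hello, ";
    assert(strlen(s) + strlen(name) < (size_t)buflen);
    strcpy(buf, s);
    strcat(buf, name);
}
[For string handling, C++ provides the more convenient and safer string class; use it whenever you can]
#include <string>
using std::string;
string getstr(string const &name)
{
    return string("hello, ") += name;
}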
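7. You shall not call malloc() without a matching free(); otherwise memory leaks.
But memory that was not obtained from malloc() must not be free()d. Once the memory
a pointer refers to has been free()d, that pointer must not be free()d again, nor
dereferenced, until it points to another valid dynamically allocated block.
[C++] You shall not use new without a matching delete.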
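This commandment comes without a code example, so here is a minimal sketch of one common way to follow it. `dup_string` and `release_string` are hypothetical helper names that are not part of the original article; resetting the pointer to NULL after free() is a convention, not something the language requires.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a heap copy of s, or NULL on failure.
 * The caller owns the result and must release it exactly once.
 */
char *dup_string(const char *s)
{
    size_t n;
    char *copy;

    assert(s != NULL);
    n = strlen(s) + 1;      /* include the terminating '\0' */
    copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/*
 * Free *pp and reset it to NULL. A second release then becomes a
 * harmless free(NULL) instead of a double free, and an accidental
 * dereference afterwards is easier to notice.
 */
void release_string(char **pp)
{
    assert(pp != NULL);
    free(*pp);
    *pp = NULL;
}
```

Typical use is `char *s = dup_string("hello");`, then work with `s`, then `release_string(&s);`. Other copies of the same pointer are not reset by this, so ownership still has to be clear.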
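8. You shall not casually mix values of different types in arithmetic, assignment, or comparison
without carefully considering the "surprises" (shocks) that numeric type conversion may bring.
Always check whether the result of a computation can exceed the range of the variable's type.
Wrong example (1):
unsigned int sum = 2000000000 + 2000000000;    /* 2 billion */
double f = 10 / 3;
Correct example (1):
/* use unsigned int throughout; note the u after the numbers (an uppercase U works too) */
unsigned int sum = 2000000000u + 2000000000u;
/* or use an explicit cast */
unsigned int sum = (unsigned int)2000000000 + 2000000000;
double f = 10.0 / 3.0;
Explanation: on the 32-bit PC platforms most common today, the integer constant 2000000000 has type
signed int (int for short). The sum of two of them is still an int, but a signed int
cannot hold 4000000000, so arithmetic overflow occurs, and the correct value may well not be
assigned to unsigned int sum, even though an unsigned int can hold 4000000000.
Likewise, 10 / 3 is an integer division that yields 3 before the result is converted to double.
Note: writing
unsigned int sum = (unsigned int)(2000000000 + 2000000000);
is also wrong.
Example (2): (thanks to sekya)
unsigned char a = 0x80;
char b = 0x80;    /* implementation-defined result */
if( a == 0x80 ) {    /* always true */
    printf( "a ok\n" );
}
if( b == 0x80 ) {    /* not necessarily true */
    printf( "b ok\n" );
}
Explanation: on systems that define char with a range of -128 to +127, the int 0x80
(whose value is +128) does not fit when converted to char, and an implementation-defined value results.
Such a program is not portable.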
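9. You shall not modify a variable of a basic type more than once within a single expression.
Otherwise the behavior is undefined.
Wrong example:
int i = 7;
int j = ++i + i++;
Correct example:
int i = 7;
int j = ++i;
j += i++;
Nor shall you modify a variable of a basic type in an expression and also read its value
elsewhere in the same expression for another purpose (that is, for any purpose other than
computing the variable's new value). Otherwise the behavior is undefined.
Wrong example:
int arr[5];
int i = 0;
arr[i] = i++;
Correct example:
int arr[5];
int i = 0;
arr[i] = i;
i++;
[C++ program]
Wrong example:
int i = 10;
cout << i << "==" << i++;
Correct example:
int i = 10;
cout << i << "==";
cout << i++;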
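10. You shall not define a macro without putting parentheses around each of its parameters.
Wrong example:
#include <stdio.h>
#define SQUARE(x) (x * x)
int main()
{
    printf("%d\n", SQUARE(10-5));    /* expands to (10-5 * 10-5) */
    return 0;
}
Correct example:
#include <stdio.h>
#define SQUARE(x) ((x) * (x))
int main()
{
    printf("%d\n", SQUARE(10-5));
    return 0;
}
Explanation: if you use C++, prefer inline functions to macros like the one above,
to avoid the many dangers of macro definitions. For example:
inline int square(int x) { return x * x; }
A "pseudo-function" defined by a macro lacks at least the following abilities that real functions have:
(1) No checking of argument types.
(2) No recursive calls.
(3) No way to take the function's address by putting & before the macro name.
(4) Arguments with side effects often cannot be used in a call. For example:
Wrong example: (thanks to yaca)
#define MACRO(x) (((x) * (x)) - ((x) * (x)))
int main()
{
    int x = 3;
    printf("%d\n", MACRO(++x));
    return 0;
}
MACRO(++x) expands to (((++x) * (++x)) - ((++x) * (++x))),
which violates the Ninth Commandment. The result is -24 with gcc 4.3.3 and 0 with VC++.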
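Postscript: there is an article handed down from "ancient times",
"The Ten Commandments for C Programmers" (Annotated Edition)
by Henry Spencer
http://www.lysator.liu.se/c/ten-commandments.html
On the one hand it is not aimed at C beginners; on the other, it deliberately imitates the language
of the medieval English Bible and reads rather stiffly. So I have written this article instead, hoping
to cover the most important concepts and the mistakes that beginners, and even veterans, make most easily.
Author: 潘科元 (Khoguan Phuann) (c)2005. Thanks to the many users of the C_and_CPP board on ptt.cc BBS
for their valuable comments and code examples.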
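R&D Notes - HTML Character Entities
Web pages often contain odd-looking strings such as &nbsp;. This one is simply a (non-breaking) space character. In principle a browser (IE, Netscape, Firefox, ...) converts it to a space when it renders the page. Sometimes, when we write a table in a web page and want a cell to show a blank, we also have to fill in &nbsp; so that the browser renders it properly.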
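Some commonly used character entities are listed below:
"Special" Character Entity References
&lt; - <    &gt; - >
&amp; - &    &quot; - "
HTML 2.0 Standard Entity References
&AElig; - Æ    &Aacute; - Á    &Acirc; - Â
&Agrave; - À    &Aring; - Å    &Atilde; - Ã
&Auml; - Ä    &Ccedil; - Ç    &ETH; - Ð
&Eacute; - É    &Ecirc; - Ê    &Egrave; - È
&Euml; - Ë    &Iacute; - Í    &Icirc; - Î
&Igrave; - Ì    &Iuml; - Ï    &Ntilde; - Ñ
&Oacute; - Ó    &Ocirc; - Ô    &Ograve; - Ò
&Oslash; - Ø    &Otilde; - Õ    &Ouml; - Ö
&THORN; - Þ    &Uacute; - Ú    &Ucirc; - Û
&Ugrave; - Ù    &Uuml; - Ü    &Yacute; - Ý
&aacute; - á    &acirc; - â    &aelig; - æ
&agrave; - à    &aring; - å    &atilde; - ã
&auml; - ä    &ccedil; - ç    &eacute; - é
&ecirc; - ê    &egrave; - è    &eth; - ð
&euml; - ë    &iacute; - í    &icirc; - î
&igrave; - ì    &iuml; - ï    &ntilde; - ñ
&oacute; - ó    &ocirc; - ô    &ograve; - ò
&oslash; - ø    &otilde; - õ    &ouml; - ö
&szlig; - ß    &thorn; - þ    &uacute; - ú
&ucirc; - û    &ugrave; - ù    &uuml; - ü
&yacute; - ý    &yuml; - ÿ
Entities Added with HTML 3.2
&nbsp; - (non-breaking space)    &iexcl; - ¡    &pound; - £
&curren; - ¤    &yen; - ¥    &brvbar; - ¦
&sect; - §    &uml; - ¨    &copy; - ©
&ordf; - ª    &laquo; - «    &not; - ¬
&shy; - (soft hyphen)    &reg; - ®    &macr; - ¯
&deg; - °    &plusmn; - ±    &sup2; - ²
&sup3; - ³    &acute; - ´    &micro; - µ
&para; - ¶    &middot; - ·    &cedil; - ¸
&sup1; - ¹    &ordm; - º    &raquo; - »
&frac14; - ¼    &frac12; - ½    &frac34; - ¾
&iquest; - ¿
Additional Widely Implemented Entities
&times; - ×
&divide; - ÷
&cent; - ¢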
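When a program generates HTML, the four "special" characters have to be written as their entity references (&lt; &gt; &amp; &quot;) or the browser will treat them as markup. The following is a minimal sketch in C; `html_escape` is a hypothetical helper name, not part of any library, and it handles only those four characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst, replacing < > & " with their
 * entity references. Returns the number of characters written
 * (excluding the '\0'), or -1 if dst (dstlen bytes) is too small.
 */
long html_escape(const char *src, char *dst, size_t dstlen)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        char one[2] = { *src, '\0' };   /* the character itself, as a string */
        const char *rep;
        size_t n;

        switch (*src) {
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '&':  rep = "&amp;";  break;
        case '"':  rep = "&quot;"; break;
        default:   rep = one;      break;
        }
        n = strlen(rep);
        if (out + n + 1 > dstlen)       /* keep room for the final '\0' */
            return -1;
        memcpy(dst + out, rep, n);
        out += n;
    }
    if (out + 1 > dstlen)
        return -1;
    dst[out] = '\0';
    return (long)out;
}
```

In the worst case every input character grows to six characters (&quot;), so a destination buffer of six times the input length plus one is always large enough.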
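Wednesday, February 27, 2008
R&D Notes - How to Keep an Abnormal Client Disconnect from Causing a Segmentation Fault
Incident:
When designing a UPnP media server, many clients send requests at the same time, so abnormal disconnects are bound to happen.
Cause:
A client may send a TCP RESET to close the connection (for example, when it repeats the same request within a short period). If our daemon then writes to the broken connection, it receives SIGPIPE, and the default action for that signal terminates the process unexpectedly.
Solution: in main(), register signal(SIGPIPE, SIG_IGN); so the write fails with EPIPE instead of killing the daemon.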
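The effect can be reproduced without a network. In the sketch below a pipe whose read end has been closed stands in for the broken client connection; `ignore_sigpipe`, `write_to_broken_pipe` and `sigpipe_demo` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Ignore SIGPIPE process-wide, so that writing to a connection the peer
 * has already closed fails with EPIPE instead of killing the process.
 * Returns 0 on success, -1 on failure.
 */
int ignore_sigpipe(void)
{
    return signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/*
 * Write to a pipe whose read end is closed. Returns the errno set by
 * write(), 0 if the write unexpectedly succeeded, or -1 if the pipe
 * could not be created.
 */
int write_to_broken_pipe(void)
{
    int fds[2];
    int err = 0;

    if (pipe(fds) != 0)
        return -1;
    close(fds[0]);              /* simulate the peer going away */
    if (write(fds[1], "x", 1) < 0)
        err = errno;
    close(fds[1]);
    return err;
}

/* Usage, e.g. early in main(): */
void sigpipe_demo(void)
{
    assert(ignore_sigpipe() == 0);
    assert(write_to_broken_pipe() == EPIPE);
}
```

With the signal ignored, the daemon has to check the return value of every write() or send() and close the connection itself when it sees EPIPE.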
