Thursday, June 23, 2005
Runaway Daemon
Not another Julia movie but this is a disaster! Running MySQL 5 out of the box with default setting, this code means sudden death (both CPU and memory resources!):
Suggestions:
Implement a time-limit mechanism, which the administrator can enforce
Limited memory space (again, administrator enforced)
DROP FUNCTION IF EXISTS runaway;
delimiter //
CREATE FUNCTION runaway () RETURNS char(255)
BEGIN
DECLARE result CHAR(255);
SET result = "";
maddy: LOOP
SET result = CONCAT(result,'a');
ITERATE maddy;
END LOOP maddy;
RETURN result;
END
//
Suggestions:
Using the LOOP construct
Here's a version using the LOOP construct:
DROP FUNCTION encrypto;
delimiter //
CREATE FUNCTION encrypto (s CHAR(255)) RETURNS char(255)
BEGIN
DECLARE mylen INT;
DECLARE cnt INT;
DECLARE result CHAR(255);
SET result = "";
SET mylen = LENGTH(s); -- oohh, this hurts multibyte!
SET cnt = 1; -- SUBSTRING starts from index 1!
crypto: LOOP
SET result = CONCAT(result,CHAR(ORD(SUBSTRING(s,cnt,1))+1));
SET cnt = cnt + 1;
IF cnt > mylen THEN LEAVE crypto; ELSE ITERATE crypto; END IF;
END LOOP crypto;
RETURN result;
END
//
First take at the new MySQL Functions
The functionality we produced through MySQL source hacking can be done with a following User Defined Function (UDF). It would be interesting to see the performance characteristics. I broke the pins on my HDD (which has Linux installed) so it would take a while to upload the benchmarks. The compiled version will defintely be much more faster but there is a tradeoff, UDFs can be created and destroyed on the fly!
DROP FUNCTION IF EXISTS encrypto2;
delimiter //
CREATE FUNCTION encrypto2 (s CHAR(255)) RETURNS char(255)
BEGIN
DECLARE mylen INT;
DECLARE cnt INT;
DECLARE result CHAR(255);
SET result = "";
SET mylen = LENGTH(s); -- oohh, this hurts multibyte!
SET cnt = 1; -- SUBSTRING starts from index 1!
WHILE mylen >= cnt DO
SET result = CONCAT(result,CHAR(ORD(SUBSTRING(s,cnt,1))+1));
SET cnt = cnt + 1;
END WHILE;
RETURN result;
END
//
Sunday, June 19, 2005
Guide: Adding a new function to MySQL core
The following diffs show you how to add a new function called 'whizkid;. We'll construct a simple function called 'whizkid' which will convert any string type into an encrypted type by adding '1' to the ascii code (we'll not worry about adding 1 to character with code 255). This particular example does not consider the multibyte option available in mysql and is for simple demo only. Five files had to be modified to add this function, namely: sql/item_create.cc, sql/item_create.h, sql/item_strfunc.cc, sql/item_strfunc.h, sql/lex.h. Their unified diff outputs are given below:
After those changes, you can use 'whizkid' as a single argument function anywhere. For example:
Yay! We got it working fine. Lets dissect each of the files to see what exactly we did!
item_create.h: We created a function declaration which will instantiate the object which contains our function.
item_create.cc: The proper definition of the function which actually returns a new object of class 'Item_func_whizkid', which takes one parameter.
str_func.h: We added a class which inherits from the base class Item_str_func (all functions returning strings should inherit this class). The func_name() function returns the name of the function itself. I tried changing it for another function and it has to match with the one defined in lex.h (more on it below). Furthermore, we have two functions in this class which we define in str_func.cc.
str_func.cc: val_str gives us the string equivalent in the variable 'res' of the argument. I copied this particular code snippet from the 'reverse' function and didn't bother to optimize it much. Basically, we allocate memory for the result first and then create a new instance of the string with the +1 encryption. The other function fix_length_and_dec tells MySQL about the length of the string this function will spit out. In our case, the max_length is the max_length of the argument itself as we are not changing the size (I'll have more to say on this later maybe).
lex.h: The new function is added into the structure which registers the function as a function which takes one argument and which can be called by a canonical name of 'whizkid'. Later gen_lex_hash generates hashes of all the functions and output it in lex_hash.h file, which uses lex.h to lookup the 'whizkid' function for the registration into the symbol table. The gen_lex_hash ensures that the table is correctly formed with no repeated entries etcetera.
Just make those changes, compile the server and you have an new function called 'whizkid'!
--- item_create.h 2005/06/19 18:48:04 1.1
+++ item_create.h 2005/06/19 18:53:37
@@ -77,6 +77,7 @@
Item *create_func_release_lock(Item* a);
Item *create_func_repeat(Item* a, Item *b);
Item *create_func_reverse(Item* a);
+Item *create_func_whizkid(Item* a);
Item *create_func_rpad(Item* a, Item *b, Item *c);
Item *create_func_rtrim(Item* a);
Item *create_func_sec_to_time(Item* a);
--- item_create.cc 2005/06/19 18:48:04 1.1
+++ item_create.cc 2005/06/19 19:43:13
@@ -332,6 +332,11 @@
return new Item_func_reverse(a);
}
+Item *create_func_whizkid(Item* a)
+{
+ return new Item_func_whizkid(a);
+}
+
Item *create_func_rpad(Item* a, Item *b, Item *c)
{
return new Item_func_rpad(a,b,c);
--- item_strfunc.h 2005/06/19 18:48:04 1.1
+++ item_strfunc.h 2005/06/19 19:39:39
@@ -106,6 +106,14 @@
const char *func_name() const { return "reverse"; }
};
+class Item_func_whizkid :public Item_str_func
+{
+public:
+ Item_func_whizkid (Item *a) :Item_str_func(a) {}
+ String *val_str(String *);
+ void fix_length_and_dec();
+ const char *func_name() const { return "whizkid"; }
+};
class Item_func_replace :public Item_str_func
{
--- item_strfunc.cc 2005/06/19 18:48:04 1.1
+++ item_strfunc.cc 2005/06/19 19:44:28
@@ -702,6 +702,43 @@
max_length = args[0]->max_length;
}
+String *Item_func_whizkid::val_str(String *str)
+{
+ DBUG_ASSERT(fixed == 1);
+ String *res = args[0]->val_str(str);
+ char *ptr,*end;
+
+ if ((null_value=args[0]->null_value))
+ return 0;
+
+ // retun if we have a zero length entity
+ if (!res->length())
+ return &my_empty_string;
+
+ res=copy_if_not_alloced(str,res,res->length());
+ ptr = (char *) res->ptr();
+ end=ptr+res->length();
+
+ // ignoring multibyte
+ {
+ char tmp;
+ while (ptr < end)
+ {
+ *ptr += 1;
+ ptr++;
+ }
+ }
+ return res;
+}
+
+
+void Item_func_whizkid::fix_length_and_dec()
+{
+ collation.set(args[0]->collation);
+ max_length = args[0]->max_length;
+}
+
+
/*
** Replace all occurences of string2 in string1 with string3.
** Don't reallocate val_str() if not needed
--- lex.h 2005/06/19 18:48:04 1.1
+++ lex.h 2005/06/19 18:53:59
@@ -717,6 +717,7 @@
{ "RAND", SYM(RAND)},
{ "RELEASE_LOCK", F_SYM(FUNC_ARG1),0,CREATE_FUNC(create_func_release_lock)},
{ "REVERSE", F_SYM(FUNC_ARG1),0,CREATE_FUNC(create_func_reverse)},
+ { "WHIZKID", F_SYM(FUNC_ARG1),0,CREATE_FUNC(create_func_whizkid)},
{ "ROUND", SYM(ROUND)},
{ "ROW_COUNT", SYM(ROW_COUNT_SYM)},
{ "RPAD", F_SYM(FUNC_ARG3),0,CREATE_FUNC(create_func_rpad)},
After those changes, you can use 'whizkid' as a single argument function anywhere. For example:
> SELECT whizkid('bumppp');
+-------------------+
| whizkid('bumppp') |
+-------------------+
| cvnqqq |
+-------------------+
1 row in set (0.00 sec)
Yay! We got it working fine. Lets dissect each of the files to see what exactly we did!
item_create.h: We created a function declaration which will instantiate the object which contains our function.
item_create.cc: The proper definition of the function which actually returns a new object of class 'Item_func_whizkid', which takes one parameter.
str_func.h: We added a class which inherits from the base class Item_str_func (all functions returning strings should inherit this class). The func_name() function returns the name of the function itself. I tried changing it for another function and it has to match with the one defined in lex.h (more on it below). Furthermore, we have two functions in this class which we define in str_func.cc.
str_func.cc: val_str gives us the string equivalent in the variable 'res' of the argument. I copied this particular code snippet from the 'reverse' function and didn't bother to optimize it much. Basically, we allocate memory for the result first and then create a new instance of the string with the +1 encryption. The other function fix_length_and_dec tells MySQL about the length of the string this function will spit out. In our case, the max_length is the max_length of the argument itself as we are not changing the size (I'll have more to say on this later maybe).
lex.h: The new function is added into the structure which registers the function as a function which takes one argument and which can be called by a canonical name of 'whizkid'. Later gen_lex_hash generates hashes of all the functions and output it in lex_hash.h file, which uses lex.h to lookup the 'whizkid' function for the registration into the symbol table. The gen_lex_hash ensures that the table is correctly formed with no repeated entries etcetera.
Just make those changes, compile the server and you have an new function called 'whizkid'!
Moving to Linux
I was trying to find the yacc grammar for mysql and I realized that the default Windows project distributed on mysql does not come with the yacc grammar with it. I am moving my development onto my Linux box running Slackware 10.0 and the hardware is a Pentium 4 2Ghz machine with 256MB RAM. I have just added a new function to mysql called 'whizkid' and I'm compiling the server now. I'll post step-by-step instructions on adding your own functions into the mysql core.
Saturday, June 18, 2005
First try on Win32
I thought I'd take a crack at my Windows box first! This is a centrino 1.4GHZ with 512MB RAM running Windows XP Home Edition with SP2, running Visual Studio Beta 2005. I was able to compile MySQL 5.0.7 Beta both under debug and release. The client tools keep on crashing so I'm using the pre-compiled tools from v4.1 already installed on my machine. I'll tweak compiler settings to see if the client tools will work. I'm going to try to add a bunch of functions in the mysql core and will try to make them work. I'll post the diff outputs here.
Fun begins!
I didn't take part in Google's Summer Code so I decided I'll hack MySQL this summer. We have lots of assignments and projects in the summer so it was kind of tough to take up the Google Summer Code Challenge! There's a robotics competition in August and I'm hoping to take part in that as well. I'm going to take a crack at MySQL 5 internals this summer and I'll post all my findings right here on this blog! There's a really neat book, t-shirt and mug offer from MySQL and I hope to win them all!