[Freeswitch-users] Some help with my post-paid billing project
Michael Giagnocavo
mgg at giagnocavo.net
Wed Oct 14 05:43:32 PDT 2009
Wouldn’t that be SQL “IN” instead of LIKE?
First off, see if you can do this in memory. If it’s just a “rate sheet” list of prefixes + billing, it’s probably not that much data. I’d guess maybe 64-128 bytes per record (in Ruby it’s probably more, maybe 256 bytes?) inside a hashtable. Even so, we’re only talking about 1.5 to 3GB of RAM. Stick it in a nice performing dictionary of some sort and you’re set.
Just as a quick test, on one core of a Core2, it takes about 150ms to do 100K lookups against a 1M item hashtable (F#) – that’s just a 1 line loop I tried out with no optimization, doing about 650K/sec. For comparison on the same machine, looping on a simple “SELECT @@VERSION” command only achieved 5000 executions/second on a single thread – that’s just the SQL overhead (to a local SQL instance!).
If the memory usage is too excessive, consider caching only subsets of the prefixes. Keep track of which leading prefix (1 or 2 digits) are the most used, and keep those in memory. Have a reasonably fast DB to fall back to.
If you’re sure the storage requirements or the change frequency truly eliminates keeping it in your own memory, here are some suggestions for working on DB performance:
A little perf gain is to pre-calculate possible lengths, so you’re performing less lookups. Just have a table based on the first 2 or 3 digits, giving you the valid prefix lengths. That can easily cut down the number of lookups from 10-12 to, say, 3-4. Compare the perf of temp tables, CTEs, and dynamic SQL “IN”.
But I would be hesitant to do this from a higher level ORM or via dynamic SQL queries:
- You’ll have a much harder time doing DB-specific optimizations/hints
- Dynamic SQL needs to be parsed each time (possibly minor issue)
- If the ad hoc queries are not properly parameterized, query plans might not get reused very well
- You can easily send many times more data to the server (your entire SQL, versus just the sproc name + parameters)
Really look into doing programmability on the server. There’s all sorts of things you can do to minimize what you need to send to the database, and let the DB engine figure out the most efficient way. You might want to use a product that’s a bit more advanced than MySQL.
And finally, test, test, test. You should have a nice benchmark with several million rows and be able to execute, say, 10-50K lookups (on multiple threads) after each change. Modify your query, execute it, _inspect the execution plan_. Check your indexes – make sure they are covering the query in the right way. (For example, if the table is (ID, Prefix, Price), you want an index of (Prefix, Price).) Review possible hints (for example, SQL Server usually needs a hint to choose a indexed view). Make sure your transaction level is set appropriately (can you do the query without locking?). Can you batch up several numbers in one command, saving on the per-command overhead?
Also consider it in light of the rest of your application. If everything is in the same database, then you can probably do the entire rate call/save cdr/update balances in a single command and transaction.
For us, our initial draft of routing inside SQL Server performed well under 100 calls/sec – not usable. After a few days of playing with things, it went over 1000 calls/sec, which is comfortable enough for a $500 server. This is matching 100 gateways and dialplans with several thousand entries per dialplan, across 4 million+ routes, doing QBR/LCR in the process.
-Michael
From: freeswitch-users-bounces at lists.freeswitch.org [mailto:freeswitch-users-bounces at lists.freeswitch.org] On Behalf Of Muhammad Shahzad
Sent: Wednesday, October 14, 2009 1:55 AM
To: freeswitch-users at lists.freeswitch.org
Subject: Re: [Freeswitch-users] Some help with my post-paid billing project
I fully agree that direct matching is much faster then pattern matching in SQL.
One of my clients had same problem, he had around 12 million number prefixes in a table and during each call an AGI script use to query that table to find longest prefix match, but this use to take like 3-5 seconds even with indexed columns. So after a lots of R & D we come up with following logic,
1. We break the destination number length-wise, e.g. suppose number is 923344224088 then length chunks would be,
923344224088, 92334422408, 9233442240, 923344224, 92334422, 9233442, 923344, 92334, 9233, 923, 92, 9
2. Then use SQL LIKE function in WHERE clause (you can also use SQL OR function if your DBMS doesn't support SQL LIKE function), and pass all these chunks to it, e.g.
WHERE prefix LIKE (923344224088, 92334422408, 9233442240, 923344224, 92334422, 9233442, 923344, 92334, 9233, 923, 92, 9)
3. Lastly we ORDER the result by prefix length, e.g.
ORDER BY LENGTH(prefix) DESC LIMIT 1
4. The complete query will be,
SELECT * FROM prefixes
WHERE prefix LIKE (923344224088, 92334422408, 9233442240, 923344224, 92334422, 9233442, 923344, 92334, 9233, 923, 92, 9)
ORDER BY LENGTH(prefix) DESC LIMIT 1
Now the query takes less then 150 ms to execute. :-)
Here is an STL method that can generate this query, i am sure you can convert it to any programming language of your choice easily.
=========================================================
std::string GetQuery(std::string destination) {
std::string query = "SELECT * FROM prefixes WHERE prefix LIKE ('" + destination;
for(int i=1; i<destination.length(); i++) {
query += "','" + destination.substr(0, (i * -1));
}
query += "') ORDER BY LENGTH(prefix) DESC LIMIT 1";
return query;
}
=========================================================
I am pretty sure this query is 100% ANSI SQL compatible (http://en.wikipedia.org/wiki/SQL).
Thank you.
On Wed, Oct 14, 2009 at 10:15 AM, Michael Giagnocavo <mgg at giagnocavo.net<mailto:mgg at giagnocavo.net>> wrote:
In our testing with SQL Server, we found that executing several queries for direct matches yielded far better performance than one query trying to check prefixes. (The column was also part of the clustered index, but AFAIK MySQL doesn’t support defining your own clustered indexes; you get the PK always.)
-Michael
From: freeswitch-users-bounces at lists.freeswitch.org<mailto:freeswitch-users-bounces at lists.freeswitch.org> [mailto:freeswitch-users-bounces at lists.freeswitch.org<mailto:freeswitch-users-bounces at lists.freeswitch.org>] On Behalf Of Diego Viola
Sent: Tuesday, October 13, 2009 7:54 PM
To: freeswitch-users at lists.freeswitch.org<mailto:freeswitch-users at lists.freeswitch.org>
Subject: Re: [Freeswitch-users] Some help with my post-paid billing project
Wrong question.
Is there a way to compare numbers with prefixes without using the prefix module?
Diego
On Wed, Oct 14, 2009 at 1:36 AM, Diego Viola <diego.viola at gmail.com<mailto:diego.viola at gmail.com>> wrote:
I'm using MySQL now but I will try PostgreSQL with the prefix module, is there a way to do that without the prefix module and with regular SQL?
Any examples?
Diego
On Tue, Oct 13, 2009 at 10:45 PM, Even André Fiskvik <grevenx at me.com<mailto:grevenx at me.com>> wrote:
What database are you using?
You could do this with regular SQL, but it would by a costly operation,
for PostgreSQL we're using the prefix module: http://pgfoundry.org/projects/prefix/
You can then match the closest prefix by using something like
"WHERE myprefix_col @> caller_destination_number ORDER BY LENGTH
(myprefix_col::text) LIMIT 1;"
Best regards,
Even André
On 13. okt. 2009, at 23.53, Diego Viola wrote:
> Hello,
>
> I'm trying to write a post-paid billing script, I have the CDR on my
> database and also a "rates" table, the CDR contains fields like
> caller_destination_number, variable_duration, etc. and the rates
> table contains: destination, prefix, rate (cost).
>
> The problem is that I can't just strip the destination number to
> take the prefix from it because I have to deal with destination
> numbers from different countries and they all have different prefix
> lengths... so I need to find another way to take the prefix from the
> destination number.
>
> Any ideas how to do this?
>
> Thanks,
>
> Diego
>
> _______________________________________________
> FreeSWITCH-users mailing list
> FreeSWITCH-users at lists.freeswitch.org<mailto:FreeSWITCH-users at lists.freeswitch.org>
> http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
> UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-
> users
> http://www.freeswitch.org
_______________________________________________
FreeSWITCH-users mailing list
FreeSWITCH-users at lists.freeswitch.org<mailto:FreeSWITCH-users at lists.freeswitch.org>
http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
http://www.freeswitch.org
_______________________________________________
FreeSWITCH-users mailing list
FreeSWITCH-users at lists.freeswitch.org<mailto:FreeSWITCH-users at lists.freeswitch.org>
http://lists.freeswitch.org/mailman/listinfo/freeswitch-users
UNSUBSCRIBE:http://lists.freeswitch.org/mailman/options/freeswitch-users
http://www.freeswitch.org
--
________________________________________________________
| |
| FATAL ERROR --- O X |
|_______________________________________________________|
| You have moved the mouse. |
| Windows must be restarted for the changes to take effect. |
| <OK> |
####################################/
Muhammad Shahzad
-----------------------------------
CISCO Rich Media Communication Specialist (CRMCS)
CISCO Certified Network Associate (CCNA)
Cell: +92 334 422 40 88
MSN: shari_786pk at hotmail.com<mailto:shari_786pk at hotmail.com>
Email: shaheryarkh at googlemail.com<mailto:shaheryarkh at googlemail.com>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.freeswitch.org/pipermail/freeswitch-users/attachments/20091014/900f94f1/attachment-0002.html
More information about the FreeSWITCH-users
mailing list