Problem: For any project for any kind of order processing in BPEL, developer develops multiple BPEL process to perform business process functionality, so that these business process can be reused in other project too.
But I personally seem a situation in production in multiple implementations at different places where order processing get stuck due to message lost scenario. e.g. Once BPEL(A) process will say I have responded but another BPEL(B) process which is waiting to get the response from BPEL A still waiting.
Solution:
Fusion Developer may find many post or blogs where they will get the solution to deal the orders which got stuck in BPEL engine due to message lost scenario.
Out of box BPEL provide an feature to recover a callback message using this path at BPEL Console. BPELConsole>>Administration>>Recover(Callback)
In reality finding a lost message for an order in this GUI will take lots of time coz there is no rich searching option available on this webpage. An operation person who support failed order for Production he/she just want to find out an lost message based upon order id, so that he/she can resubmit that transaction for stuck order but due to limited searching capability in this GUI its very hard to do that. The only search option available to select BPEL process name which is not much useful at all. Also I am sure in many production environment the list of message in this console is very huge, so user can't resubmit everything from this queue.
In best of my knowledge I personally knew few scenarios where an asynchronous BPEL process is waiting for some response from external system and external system already sent the response and corresponding BPEL got resumed based on correlation id and BPEL process is completed successfully but external system keep sending the expected response for same order id using same correlation id again and again, the incoming subsequent message which coming to bpel engine won't get processed since first response for that order id is already processed when second response comes it sit here in this queue and message in this queue get piled up. Its a kind of defect and depend on environment to environment.
Many times due to bulk orders resubmission in BPEL engine many orders get stuck during processing coz delivery thread won’t be able to process all the orders. Message will come and will sit in dlv_message table with the undelivered status and never get picked up. When Customer escalate about his/her orders then Failed order member look into that order and realized this order is the victim of message lost scenario.
To fix this problem there are two ways once way to resubmit all the message from recovery queue but that again will cause the performance issue on production or other option to re trigger the stuck order from BPEL console again.
In case of re trigger the order from BPEL console, first requirement comes to get the list of stuck order which is not easy to get from Cube_Instance table. The reason behind is Cube_Instance table store state value 1 for Running/Waiting/Failed/Stuck instance but user need only stuck orders.
For all above four different state at database level instance value will always be 1 only. Now if any DB user even pull out the instance list select * from orabpel.cube_instance where state=1 will not give the list of stuck orders.
To get the final list of stuck order what need to do that’s what I want to cover here.
In Fusion an order either could be Running/Failed/Waiting/Faulted/Stuck/Aborted/Stale state.
If user will pull out all the orders list from cube_instance where state =1 then all running instance will come out then out of that subtract the orders which are failed as per BAU and has created activity in work_item table. Once order get failed it create an activity in BPEL which can be viewed using Activities console and data stored in Work_item table. Just subtract the list of instance from work_item table which have pending activity using this query select * from orabpel.work_item where state=13. Then finally subtract the list of instance which are waiting for response from external system or from other bpel process and an subscription is written out in dlv_subscription table. User can get the list of waiting instance from dlv_subscription table using this query select * from orabpel.dlv_subscription where state !=1
The final resulted CIKEY will be the list stuck order, based upon index setting as per user project final result can be refined. e.g if BPEL process is setting up order number an index in CI_indexes table then result can be refined using Index_1 column.
Here is the query which will give the stuck orders in orabpel schema.
select cbi.cikey,cbi.process_id, cbi.creation_date, cbi.modify_date,cbi.state, cbi.status, cbi.title,ci.index_1
from orabpel.cube_instance cbi, orabpel.ci_indexes ci,
(select cikey from
(select cikey from orabpel.cube_instance where state =1
minus
select cikey from orabpel.work_item where state =13 ) firstSub
minus
select cikey from orabpel.dlv_subscription where state !=1) secondSub
where secondSub.Cikey=cbi.cikey
and cbi.cikey=ci.cikey and trunc(cbi.modify_date) <= to_timestamp('2011-09-12 23:59:59','YYYY-MM-DD HH24:MI:SS')
from orabpel.cube_instance cbi, orabpel.ci_indexes ci,
(select cikey from
(select cikey from orabpel.cube_instance where state =1
minus
select cikey from orabpel.work_item where state =13 ) firstSub
minus
select cikey from orabpel.dlv_subscription where state !=1) secondSub
where secondSub.Cikey=cbi.cikey
and cbi.cikey=ci.cikey and trunc(cbi.modify_date) <= to_timestamp('2011-09-12 23:59:59','YYYY-MM-DD HH24:MI:SS')
The above query might need some changes based up on environment and project design. This query will be helpful for the projects where index_1 has been utilized to store order number. Also modify_date need to be adjusted based upon requirement. Or some db expert might can refine tune the query for best performance.
The objective or this post to give idea how to pull out the list of stuck orders since orabpel.cube_instance table state column won’t store any separate state for stuck orders. State=1 represent multiple state as result it was not easy to pull out only stuck orders list.
No comments:
Post a Comment